
level 5 stochastic

Thu Apr 09 2026 20:00:00 GMT-0400 (Eastern Daylight Time) ·experiment-writeup ·status: complete

Level 5 — Stochastic Calculus Drills

The hardest level in the quant-from-scratch roadmap. This is the level where “data scientist who likes finance” becomes “quant.” The curriculum is normally 6-8 weeks of study; we’re speeding through the computational drills here and leaning on the author’s instinct that the key insights are:

  1. (dW_t)² = dt: the squared Brownian increment is first-order in dt, not second-order as it would be for a smooth path
  2. That single fact is what makes Itô calculus differ from ordinary calculus
  3. Black-Scholes falls out of Itô + a delta-hedging argument that cancels the dW terms
  4. The resulting option price is independent of the stock’s drift μ — the risk-neutral pricing result
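Insight 1 is easy to check numerically. The sketch below is an illustration rather than part of the drill script (the function name and defaults are assumptions): it sums squared Brownian increments over [0, T] on a fine grid. If (dW)² behaves like dt, the sum should land close to T instead of shrinking toward zero as the grid is refined.

```python
import math
import random

def quadratic_variation(T=1.0, n_steps=100_000, seed=0):
    """Sum of squared Brownian increments over [0, T] on a uniform grid."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    # Each increment dW ~ N(0, dt); accumulate (dW)^2 across the grid.
    return sum((sqrt_dt * rng.gauss(0.0, 1.0)) ** 2 for _ in range(n_steps))

qv = quadratic_variation()
print(f"sum of (dW)^2 over [0, 1]: {qv:.4f}  (theory: T = 1.0)")
```

For a smooth path the same sum would be of order dt and vanish in the limit; for Brownian motion it stays pinned near T.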

The computational drills (Black-Scholes from scratch, Monte Carlo convergence, all five Greeks) are in ../scripts/level_5_black_scholes.py. The pen-and-paper derivations are written up below as prose.

Derivation 1 — Itô’s lemma on f(S) = ln(S) where S follows GBM

Setup. Let S follow geometric Brownian motion:

dS_t = μ S_t dt + σ S_t dW_t

where W is a standard Brownian motion. Let f(S) = ln(S). We want dX where X_t = f(S_t) = ln(S_t).

Ordinary calculus answer (wrong). Ordinary chain rule would say:

df = f'(S) dS = (1/S) · (μS dt + σS dW) = μ dt + σ dW

This would give X_t ~ N(ln(S_0) + μt, σ²t), which implies the expected log return (the geometric mean return) equals the arithmetic drift μ. That is the error.

Itô’s lemma (correct). Because (dW_t)² = dt is first-order, we have to keep the second-order term in the Taylor expansion:

df = f'(S) dS + (1/2) f''(S) (dS)²

Compute the pieces:

f'(S) = 1/S
f''(S) = −1/S²
(dS)² = σ²S² dt   (the dt² and dt·dW terms are higher-order and drop out)

Substitute:

df = (1/S)(μS dt + σS dW) + (1/2)(-1/S²)(σ²S² dt)
    = μ dt + σ dW − (σ²/2) dt
    = (μ − σ²/2) dt + σ dW

The extra −σ²/2 term is the “Itô correction.” Under GBM:

ln(S_T / S_0) ~ N((μ − σ²/2)T, σ²T)

which means the median terminal price is S_0 e^((μ−σ²/2)T), not S_0 e^(μT); the latter is the mean. The gap between arithmetic and geometric returns that every retail investor eventually gets surprised by is exactly this term. Volatility is a tax on compound growth, and the tax rate is σ²/2.

This is the first place stochastic calculus bites. It’s also the place where “high vol, high return” strategies quietly hand back all their edge.
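The Itô correction can be seen without assuming it. The sketch below steps dS = μS dt + σS dW forward with plain Euler steps (so the −σ²/2 term is never typed in) and then looks at the average log return. The parameter values are illustrative assumptions, chosen with high vol so the gap is obvious; they are not from the drills.

```python
import math
import random
import statistics

def euler_gbm_terminal(s0, mu, sigma, T, n_steps, n_paths, seed=0):
    """Terminal prices from Euler steps of dS = mu*S*dt + sigma*S*dW."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    terminal = []
    for _path in range(n_paths):
        s = s0
        for _step in range(n_steps):
            s += mu * s * dt + sigma * s * sqrt_dt * rng.gauss(0.0, 1.0)
        terminal.append(s)
    return terminal

# Illustrative values (assumptions, not the drill parameters).
s0, mu, sigma, T = 100.0, 0.10, 0.40, 1.0
prices = euler_gbm_terminal(s0, mu, sigma, T, n_steps=50, n_paths=20_000)

mean_log = statistics.fmean(math.log(s / s0) for s in prices)
print(f"mean log return   : {mean_log:.4f}")
print(f"ordinary calculus : {mu * T:.4f}")
print(f"Ito prediction    : {(mu - 0.5 * sigma**2) * T:.4f}")
print(f"mean price {statistics.fmean(prices):.2f} vs median price {statistics.median(prices):.2f}")
```

The mean log return should sit near μ − σ²/2 rather than μ, and the median terminal price should sit visibly below the mean.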

Derivation 2 — The Black-Scholes PDE from delta hedging

Setup. Let V(S, t) be the price of a European option on S, where S follows GBM as above. Construct a portfolio Π that is long 1 option and short Δ = ∂V/∂S shares of the underlying:

Π = V − (∂V/∂S) · S

Apply Itô’s lemma to V(S, t).

dV = (∂V/∂t) dt + (∂V/∂S) dS + (1/2)(∂²V/∂S²)(dS)²

Substituting dS = μS dt + σS dW and (dS)² = σ²S² dt:

dV = [(∂V/∂t) + μS(∂V/∂S) + (σ²S²/2)(∂²V/∂S²)] dt + σS(∂V/∂S) dW

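Compute dΠ, holding the share position ∂V/∂S fixed over the infinitesimal step (the hedge is rebalanced only between steps):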

dΠ = dV − (∂V/∂S) dS
   = [(∂V/∂t) + μS(∂V/∂S) + (σ²S²/2)(∂²V/∂S²)] dt + σS(∂V/∂S) dW
     − (∂V/∂S)(μS dt + σS dW)
   = [(∂V/∂t) + (σ²S²/2)(∂²V/∂S²)] dt

Watch what happened. The dW terms cancelled perfectly. The μS dt terms cancelled. The portfolio is locally riskless: over an infinitesimal time step, dΠ has no random component at all.

Apply the no-arbitrage principle. A locally riskless portfolio must earn the risk-free rate:

dΠ = r Π dt = r [V − S(∂V/∂S)] dt

Equate and rearrange:

(∂V/∂t) + (σ²S²/2)(∂²V/∂S²) = rV − rS(∂V/∂S)
(∂V/∂t) + rS(∂V/∂S) + (σ²S²/2)(∂²V/∂S²) − rV = 0

This is the Black-Scholes PDE. The drift μ vanished entirely. Risk preferences don’t enter. You can price options “as if everyone is risk-neutral” — which is what the Monte Carlo drill below does, using drift r instead of μ.

For a European call with terminal condition V(S, T) = max(S − K, 0) the PDE solves to:

C = S · N(d₁) − K · e^(−rT) · N(d₂)
d₁ = [ln(S/K) + (r + σ²/2)T] / (σ√T)
d₂ = d₁ − σ√T

where N is the standard normal CDF and the price is evaluated at t = 0, so T is the time to expiry. We use this directly in the script.
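One way to sanity-check the closed form against the PDE is to plug it back in numerically. The sketch below is a hedged illustration, not the author's script: the function names and bump sizes are assumptions. It evaluates the left-hand side of the Black-Scholes PDE with central finite differences; the residual should be close to zero at any (S, time-to-expiry) point.

```python
import math
from statistics import NormalDist

_N = NormalDist().cdf

def bs_call(S, K, tau, r, sigma):
    """Closed-form call price; tau is time to expiry (T - t)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * _N(d1) - K * math.exp(-r * tau) * _N(d2)

def pde_residual(S, K, tau, r, sigma, h_s=1e-2, h_t=1e-4):
    """V_t + r*S*V_S + 0.5*sigma^2*S^2*V_SS - r*V via central differences."""
    v = bs_call(S, K, tau, r, sigma)
    up = bs_call(S + h_s, K, tau, r, sigma)
    down = bs_call(S - h_s, K, tau, r, sigma)
    v_s = (up - down) / (2 * h_s)
    v_ss = (up - 2 * v + down) / h_s**2
    # Calendar time t runs opposite to time-to-expiry tau, so dV/dt = -dV/dtau.
    v_t = -(bs_call(S, K, tau + h_t, r, sigma) - bs_call(S, K, tau - h_t, r, sigma)) / (2 * h_t)
    return v_t + r * S * v_s + 0.5 * sigma**2 * S**2 * v_ss - r * v

for S, tau in [(100.0, 1.0), (80.0, 0.5), (120.0, 0.25)]:
    print(f"S={S:6.1f} tau={tau:4.2f} residual={pde_residual(S, 105.0, tau, 0.05, 0.2):+.2e}")
```

Note that μ never appears anywhere in the check, which is the point of the derivation.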

Drill 3 — Black-Scholes from scratch + Monte Carlo convergence

Parameters: S=100, K=105, T=1, r=0.05, σ=0.2. European call.

Analytical Black-Scholes: $8.021352. Put: $7.900442.

Put-call parity sanity check: C − P = 0.12091043, and S − K·e^(−rT) = 0.12091043. Difference: 0.00e+00. Machine precision. The BS formula is internally consistent.
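A minimal sketch of the closed-form pricer and the parity check, using only the standard library. The function names are illustrative and need not match those in ../scripts/level_5_black_scholes.py. The put is priced from its own formula rather than from parity, so the parity gap is a genuine check.

```python
import math
from statistics import NormalDist

_N = NormalDist().cdf

def _d1_d2(S, K, T, r, sigma):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    return d1, d1 - sigma * math.sqrt(T)

def bs_call(S, K, T, r, sigma):
    d1, d2 = _d1_d2(S, K, T, r, sigma)
    return S * _N(d1) - K * math.exp(-r * T) * _N(d2)

def bs_put(S, K, T, r, sigma):
    d1, d2 = _d1_d2(S, K, T, r, sigma)
    return K * math.exp(-r * T) * _N(-d2) - S * _N(-d1)

# Drill parameters from the write-up.
S, K, T, r, sigma = 100.0, 105.0, 1.0, 0.05, 0.2
call = bs_call(S, K, T, r, sigma)
put = bs_put(S, K, T, r, sigma)
parity_gap = (call - put) - (S - K * math.exp(-r * T))

print(f"call = {call:.6f}, put = {put:.6f}")
print(f"put-call parity gap = {parity_gap:.2e}")
```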

Monte Carlo convergence (simulate under the risk-neutral measure with drift = r, not μ):

N samples     MC price    std error    |MC − BS|
1,000         7.491       0.401        0.531
10,000        7.940       0.132        0.082
100,000       8.006       0.042        0.015
1,000,000     8.025       0.013        0.003
5,000,000     8.028       0.006        0.007
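A row of this table could be produced along the following lines. This is a sketch under assumptions, not the author's script: it uses the standard library instead of NumPy, samples the terminal price from the exact GBM solution with drift r, and the seed and sample count are arbitrary, so it will not reproduce the exact numbers above.

```python
import math
import random

def mc_call_price(S0, K, T, r, sigma, n_paths, seed=0):
    """Risk-neutral Monte Carlo price of a European call, with its standard error."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * T   # drift of ln(S) under the risk-neutral measure: r, not mu
    vol = sigma * math.sqrt(T)
    discount = math.exp(-r * T)
    total = total_sq = 0.0
    for _ in range(n_paths):
        s_T = S0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        payoff = discount * max(s_T - K, 0.0)
        total += payoff
        total_sq += payoff * payoff
    mean = total / n_paths
    var = (total_sq - n_paths * mean * mean) / (n_paths - 1)  # sample variance of discounted payoffs
    return mean, math.sqrt(var / n_paths)

bs_reference = 8.021352  # analytical value quoted in this drill
mc_price, mc_se = mc_call_price(100.0, 105.0, 1.0, 0.05, 0.2, n_paths=200_000)
print(f"MC price = {mc_price:.3f} +/- {mc_se:.3f}, |MC - BS| = {abs(mc_price - bs_reference):.3f}")
```

The standard error is the sample standard deviation of the discounted payoffs divided by √N, which is where the N^(-1/2) scaling comes from.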

Empirical convergence slope: -0.565 (vs theoretical -0.500 from the CLT).

The CLT says the standard error scales as O(N^(-1/2)). A log-log fit of the absolute error |MC − BS| against N gives a slope of -0.565, which is close to -0.5 given that each row is a single noisy realization (the 5M row's error is actually larger than the 1M row's). Monte Carlo converges to the analytical Black-Scholes price at roughly the rate theory predicts. At 1M simulations we’re within 0.003 of the analytical answer. This is the “naive MC works for endpoint pricing” baseline; the article’s warning that this breaks on tail events is what importance sampling (see simulate-like-quant-desk) is designed to fix.
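The slope fit itself is a plain least-squares line through the log-log points. The sketch below refits the rounded values from the table (the helper name is an assumption). Because the inputs are rounded, the slope for |MC − BS| should come out close to, but not exactly at, the reported -0.565. The std error column is fitted as well, since that is the quantity the CLT statement is actually about.

```python
import math

def loglog_slope(ns, errs):
    """Least-squares slope of log(err) against log(N)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errs]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Rounded values copied from the convergence table.
n_samples = [1_000, 10_000, 100_000, 1_000_000, 5_000_000]
abs_err = [0.531, 0.082, 0.015, 0.003, 0.007]
std_err = [0.401, 0.132, 0.042, 0.013, 0.006]

abs_err_slope = loglog_slope(n_samples, abs_err)
std_err_slope = loglog_slope(n_samples, std_err)
print(f"slope of |MC - BS| : {abs_err_slope:.3f}")
print(f"slope of std error : {std_err_slope:.3f}")
```

The std error slope is the cleaner test of the N^(-1/2) rate, because the absolute error of a single run is itself a random draw.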

Drill 4 — All five Greeks, analytical and finite-difference

Computed each Greek two ways: (a) closed form from partial derivatives of the BS formula, (b) central finite differences on the BS function itself as a sanity check.

Greek              Analytical     Finite-diff    |diff|
Delta (∂V/∂S)      0.54222833     0.54222833     7.9e-11
Gamma (∂²V/∂S²)    0.01983526     0.01983480     4.6e-07
Theta (∂V/∂t)      -6.27712644    -6.27712644    4.0e-09
Vega (∂V/∂σ)       39.67052381    39.67052379    1.6e-08
Rho (∂V/∂r)        46.20148112    46.20148069    4.3e-07

All Greeks match to at least 6 decimal places. Gamma has the largest finite-difference error because it’s a second derivative: the difference quotient divides by h², which amplifies numerical noise.
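The two-way comparison can be sketched as follows. The function names and bump sizes are assumptions, so the finite-difference column will differ from the table in its last digits, while the analytical column should agree with it.

```python
import math
from statistics import NormalDist

_norm = NormalDist()

def bs_call(S, K, T, r, sigma):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * _norm.cdf(d1) - K * math.exp(-r * T) * _norm.cdf(d2)

def analytical_greeks(S, K, T, r, sigma):
    """Closed-form call Greeks; theta is dV/dt per year of calendar time."""
    sqrt_T = math.sqrt(T)
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt_T)
    d2 = d1 - sigma * sqrt_T
    pdf_d1 = _norm.pdf(d1)
    disc_K = K * math.exp(-r * T)
    return {
        "delta": _norm.cdf(d1),
        "gamma": pdf_d1 / (S * sigma * sqrt_T),
        "theta": -S * pdf_d1 * sigma / (2 * sqrt_T) - r * disc_K * _norm.cdf(d2),
        "vega": S * pdf_d1 * sqrt_T,
        "rho": T * disc_K * _norm.cdf(d2),
    }

def finite_diff_greeks(S, K, T, r, sigma, h_s=1e-2, h=1e-4):
    """Central finite differences on bs_call (bump sizes are illustrative)."""
    f = bs_call
    base = f(S, K, T, r, sigma)
    return {
        "delta": (f(S + h_s, K, T, r, sigma) - f(S - h_s, K, T, r, sigma)) / (2 * h_s),
        "gamma": (f(S + h_s, K, T, r, sigma) - 2 * base + f(S - h_s, K, T, r, sigma)) / h_s**2,
        # More calendar time elapsed means less time to expiry, hence the flipped order.
        "theta": (f(S, K, T - h, r, sigma) - f(S, K, T + h, r, sigma)) / (2 * h),
        "vega": (f(S, K, T, r, sigma + h) - f(S, K, T, r, sigma - h)) / (2 * h),
        "rho": (f(S, K, T, r + h, sigma) - f(S, K, T, r - h, sigma)) / (2 * h),
    }

params = (100.0, 105.0, 1.0, 0.05, 0.2)  # drill parameters
exact = analytical_greeks(*params)
approx = finite_diff_greeks(*params)
for name in exact:
    print(f"{name:5s} {exact[name]:14.8f} {approx[name]:14.8f} {abs(exact[name] - approx[name]):.1e}")
```

The bump size is a trade-off: too large and truncation error dominates, too small and round-off does, and gamma is the most sensitive to that choice.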

What each Greek tells you operationally:

Plot: outputs/level_5_bs_greeks.png — three panels: call price vs strike, delta vs strike (S-shaped from 0→1 as K drops below S), and gamma vs strike (bell curve peaking near ATM).

Gate check → ready for the Prediction Markets track

Yes. L5 drills all pass cleanly:

The five math levels of the curriculum are now complete. Math foundation is in place.

Proposed next step (the consolidation pass I flagged in the L4 reply): before starting the prediction-market track, extract the reusable pieces — data pulls, normality tests, regression, permutation test, portfolio optimizer, BS — into a shared autoinv Python package under 01-projects/automated-investing/autoinv/. That buys us reusable utilities, a place for unit tests, and train/test/rolling-window infrastructure that the prediction-market drills will need anyway.

After the consolidation pass: PM track kicks off with a Polymarket CLOB WebSocket subscriber (L1 of the 5-layer production stack) + a binary contract Monte Carlo simulator with Brier score calibration.