Level 1 — Probability Drills
First two drills from the quant-from-scratch roadmap Level 1 homework. These are vibe checks — the point isn’t the math (it’s trivial) but confirming the dev environment is wired up correctly and that I can produce reproducible, plot-backed outputs.
Drill 1 — Law of Large Numbers via coin flips
Script: ../scripts/level_1_coin_flip.py
Setup: 10,000 fair coin flips, seed 42. Compute running average after each flip. Expected behavior: early values swing wildly, then asymptotically converge to 0.5 (the true probability).
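For reference, a minimal sketch of the idea (not the actual script — the exact RNG call and plotting details are assumptions, so the numbers it prints won't necessarily match the results below):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)            # seeded; the script's RNG call may differ
flips = rng.integers(0, 2, size=10_000)    # 1 = heads, 0 = tails, fair coin
running_avg = np.cumsum(flips) / np.arange(1, flips.size + 1)

print(f"Final running average: {running_avg[-1]:.4f} (true: 0.5000)")
print(f"Absolute error: {abs(running_avg[-1] - 0.5):.4f}")

plt.plot(np.arange(1, flips.size + 1), running_avg)
plt.axhline(0.5, color="red", linestyle="--", label="true p = 0.5")
plt.xscale("log")                          # log x-axis makes the early swings visible
plt.xlabel("flip count")
plt.ylabel("running average")
plt.legend()
plt.savefig("outputs/level_1_lln.png")
```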
Result:
Final running average: 0.4985 (true: 0.5000)
Absolute error: 0.0015
An absolute error of 0.0015 after 10K flips matches the CLT expectation: the standard error of the sample proportion at p=0.5, n=10000 is sqrt(0.5*0.5/10000) ≈ 0.005, so 0.0015 is well within one SE.
Plot: outputs/level_1_lln.png — running average vs flip count, with red dashed line at 0.5. Log-scale x-axis to show the early swings clearly.
Takeaway: vibe check passed. numpy + matplotlib + the venv all wired correctly.
Drill 2 — Bayesian updater
Script: ../scripts/level_1_bayesian_updater.py
Setup: Assume a biased coin with true p=0.7. Start with a uniform prior over p ∈ (0, 1) (grid of 500 points). Flip the coin 50 times, update the posterior sequentially via Bayes’ rule: posterior ∝ prior × likelihood, where the likelihood is p for heads and (1-p) for tails.
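The core of the grid updater, sketched from the description above (the grid endpoints and seed are placeholder choices, not necessarily the script's exact values):

```python
import numpy as np

rng = np.random.default_rng(42)            # placeholder seed
true_p = 0.7
flips = rng.random(50) < true_p            # True = heads with probability 0.7

grid = np.linspace(0.001, 0.999, 500)      # 500-point grid over p in (0, 1)
posterior = np.ones_like(grid)             # uniform prior

for heads in flips:
    likelihood = grid if heads else (1 - grid)   # p for heads, (1 - p) for tails
    posterior = posterior * likelihood           # posterior ∝ prior × likelihood
    posterior /= posterior.sum()                 # renormalize each step to avoid underflow

print(f"Observed heads: {flips.sum()}/50")
print(f"MAP estimate: {grid[np.argmax(posterior)]:.4f}")
print(f"Sample mean: {flips.mean():.4f}")
```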
Result:
True p: 0.7
Observed heads: 32/50
MAP estimate: 0.6390
Sample mean: 0.6400
Why MAP matches sample mean: with a uniform prior, the posterior is proportional to the likelihood, so the MAP equals the MLE. The MLE of a Bernoulli parameter from n flips with k heads is k/n = 32/50 = 0.64. The 0.639 MAP is essentially the same number (tiny grid quantization). This is the first correctness check on the updater — if I’d written it wrong, MAP and sample mean wouldn’t agree.
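A quick numerical check of the grid-quantization point — assuming the grid is something like np.linspace(0.001, 0.999, 500) (the script's exact endpoints are an assumption), the nearest grid point to 0.64 sits at 0.639:

```python
import numpy as np

# Hypothetical 500-point grid over (0, 1); spacing ≈ 0.002, so the grid MAP
# can sit at most ~0.001 away from the continuous MLE of k/n.
grid = np.linspace(0.001, 0.999, 500)
mle = 32 / 50                                   # 0.64
nearest = grid[np.argmin(np.abs(grid - mle))]
print(nearest, abs(nearest - mle))              # ≈ 0.639, 0.001
```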
Plot: outputs/level_1_bayesian.png — posterior curves at {0, 1, 5, 10, 25, 50} flips, showing the distribution concentrating from flat uniform to a tight peak near the true value. Red dashed line at p=0.7 (the truth).
Takeaway: vibe check passed. The posterior concentrates as expected and the relationship to the frequentist MLE is correct.
Observations worth logging
- Estimation error is real even on trivial problems. Even with 50 flips of a biased coin, the sample mean was 0.64 not 0.70. Our posterior’s MAP was 0.639. This 6-point gap on trivial data is a preview of the estimation-error warning from the roadmap article — on real market data the gap between true and estimated parameters is the thing that kills strategies.
- Grid-based Bayesian updating is fine at L1, but doesn’t scale. At L3+ we’ll need MCMC (PyMC / Stan) or variational methods for continuous parameter spaces. Grid works for 1 parameter; dies at 3+. (A hedged PyMC sketch of what that would look like follows this list.)
- The reproducibility setup is working. Both scripts use seeded RNG and write plots to a deterministic location. If I re-run tomorrow, I get the same numbers. This matters a lot for the “first 10 strategies are noise” problem — we have to be able to trust that a backtest isn’t changing between runs because of randomness.
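For the record, a minimal sketch of the MCMC version of Drill 2, assuming PyMC 5 is installed — not something I've run yet; the prior, sampler settings, and seed are all placeholder choices:

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(42)
flips = (rng.random(50) < 0.7).astype(int)     # synthetic biased-coin data, as in Drill 2

with pm.Model():
    p = pm.Beta("p", alpha=1.0, beta=1.0)      # Beta(1, 1) = the same uniform prior
    pm.Bernoulli("obs", p=p, observed=flips)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=42)

print(float(idata.posterior["p"].mean()))      # posterior mean of p, should land near 0.6–0.7
```

The point is that the model block stays the same size as parameters are added, whereas the grid blows up exponentially.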
Gate check → proceed to Level 2?
Yes. Both L1 drills pass:
- LLN converges within one SE of truth ✓
- Bayesian updater produces MAP matching frequentist MLE on uniform prior ✓
- Plots render, venv is reproducible ✓
Next action (Level 2): pull real stock returns via yfinance, test normality (expect to fail — fat tails), fit a Student-t distribution via MLE, and run a Fama-French 3-factor regression. The first L2 drill is the one where “everything looks like signal until you remember noise looks like signal too.”
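A rough, hedged preview of those first L2 steps — a sketch, not the drill itself; the ticker, date range, and choice of normality test (Jarque-Bera here) are placeholder assumptions:

```python
import yfinance as yf
from scipy import stats

# Placeholder ticker and dates; the actual L2 drill may use different data.
prices = yf.download("SPY", start="2015-01-01", end="2024-12-31")["Close"]
returns = prices.pct_change().dropna().to_numpy().ravel()

# Normality check on daily returns — expect a near-zero p-value (fat tails).
jb = stats.jarque_bera(returns)
print(f"Jarque-Bera p-value: {jb.pvalue:.2e}")

# MLE fit of a Student-t; a small fitted df is the usual fat-tail signature.
df, loc, scale = stats.t.fit(returns)
print(f"Fitted t degrees of freedom: {df:.2f}")
```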