

Thu Apr 09 2026 20:00:00 GMT-0400 (Eastern Daylight Time) · experiment-writeup · status: complete

Level 1 — Probability Drills

First two drills from the quant-from-scratch roadmap Level 1 homework. These are vibe checks — the point isn’t the math (it’s trivial) but confirming the dev environment is wired up correctly and that I can produce reproducible, plot-backed outputs.

Drill 1 — Law of Large Numbers via coin flips

Script: ../scripts/level_1_coin_flip.py

Setup: 10,000 fair coin flips, seed 42. Compute running average after each flip. Expected behavior: early values swing wildly, then asymptotically converge to 0.5 (the true probability).
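The setup above can be sketched in a few lines. This is not the actual script (../scripts/level_1_coin_flip.py), just the core idea; the exact RNG calls are an assumption:

```python
import numpy as np

rng = np.random.default_rng(42)          # seeded for reproducibility
flips = rng.integers(0, 2, size=10_000)  # 0 = tails, 1 = heads, fair coin
running_avg = np.cumsum(flips) / np.arange(1, 10_001)

# CLT sanity bound: SE of the sample proportion at p = 0.5, n = 10,000
se = np.sqrt(0.5 * 0.5 / 10_000)         # ≈ 0.005
print(f"final running average: {running_avg[-1]:.4f} (SE ≈ {se:.4f})")
```

The cumsum/arange trick gives the running average after every flip in one vectorized pass, which is what the log-scale plot needs.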

Result:

Final running average: 0.4985 (true: 0.5000)
Absolute error: 0.0015

An absolute error of 0.0015 after 10K flips is consistent with the CLT: the standard error of the sample proportion at p = 0.5, n = 10,000 is sqrt(0.5 × 0.5 / 10,000) ≈ 0.005, so 0.0015 is well within one SE.

Plot: outputs/level_1_lln.png — running average vs flip count, with red dashed line at 0.5. Log-scale x-axis to show the early swings clearly.

Takeaway: vibe check passed. numpy + matplotlib + the venv all wired correctly.

Drill 2 — Bayesian updater

Script: ../scripts/level_1_bayesian_updater.py

Setup: Assume a biased coin with true p=0.7. Start with a uniform prior over p ∈ (0, 1) (grid of 500 points). Flip the coin 50 times, update the posterior sequentially via Bayes’ rule: posterior ∝ prior × likelihood, where the likelihood is p for heads and (1-p) for tails.
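A minimal sketch of the grid updater described above. Again, not the actual script (../scripts/level_1_bayesian_updater.py); the seed and grid endpoints here are assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
true_p = 0.7
grid = np.linspace(0.001, 0.999, 500)   # 500-point grid over p in (0, 1)
posterior = np.full(500, 1.0 / 500)     # uniform prior

flips = rng.random(50) < true_p         # True = heads with probability 0.7
for heads in flips:
    likelihood = grid if heads else 1.0 - grid
    posterior *= likelihood             # posterior ∝ prior × likelihood
    posterior /= posterior.sum()        # renormalize after each flip

map_est = grid[np.argmax(posterior)]
print(f"MAP: {map_est:.4f}, sample mean: {flips.mean():.4f}")
```

Renormalizing every step isn't mathematically required (a single normalization at the end would do), but it keeps the numbers from underflowing over long flip sequences.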

Result:

True p:          0.7
Observed heads:  32/50
MAP estimate:    0.6390
Sample mean:     0.6400

Why MAP matches sample mean: with a uniform prior, the posterior is proportional to the likelihood, so the MAP equals the MLE. The MLE of a Bernoulli parameter from n flips with k heads is k/n = 32/50 = 0.64. The 0.639 MAP is the same number up to grid quantization (500 points over (0, 1) is a spacing of about 0.002). This is the first sanity check that the updater is correct: if the update rule were wrong, MAP and sample mean would not agree.
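There's also a closed-form cross-check (not part of the script): a uniform prior is Beta(1, 1), so after k heads in n flips the posterior is exactly Beta(1 + k, 1 + n − k), and the mode of that Beta is k/n:

```python
# Conjugate cross-check: uniform prior = Beta(1, 1), so the posterior
# after k heads in n flips is Beta(1 + k, 1 + n - k) exactly.
k, n = 32, 50
a, b = 1 + k, 1 + n - k                 # Beta(33, 19)
mode = (a - 1) / (a + b - 2)            # mode of Beta(a, b) for a, b > 1
print(mode)                             # 0.64, i.e. k/n, matching the MAP
```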

Plot: outputs/level_1_bayesian.png — six posterior curves at {0, 1, 5, 10, 25, 50} flips, showing the distribution concentrating from flat uniform to a tight peak near the true value. Red dashed line at p=0.7 (the truth).

Takeaway: vibe check passed. The posterior concentrates as expected and the relationship to the frequentist MLE is correct.

Observations worth logging

  1. Estimation error is real even on trivial problems. Even with 50 flips of a biased coin, the sample mean was 0.64, not 0.70. The posterior's MAP was 0.639. This 6-percentage-point gap on trivial data is a preview of the estimation-error warning from the roadmap article — on real market data the gap between true and estimated parameters is the thing that kills strategies.

  2. Grid-based Bayesian updating is fine at L1, but doesn’t scale. At L3+ we’ll need MCMC (PyMC / Stan) or variational methods for continuous parameter spaces. Grid works for 1 parameter; dies at 3+.

  3. The reproducibility setup is working. Both scripts use seeded RNG and write plots to a deterministic location. If I re-run tomorrow, I get the same numbers. This matters a lot for the “first 10 strategies are noise” problem — we have to be able to trust that a backtest isn’t changing between runs because of randomness.
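The seeding pattern both scripts rely on boils down to this (the `run` function here is a hypothetical stand-in for a drill script, not code from the repo):

```python
import numpy as np

def run(seed: int) -> np.ndarray:
    # Stand-in for a drill script: any computation driven only by this RNG.
    rng = np.random.default_rng(seed)
    return rng.random(1000)

# Same seed → bit-identical output on every re-run; different seed → different.
assert np.array_equal(run(42), run(42))
assert not np.array_equal(run(42), run(43))
```

The key discipline is that every source of randomness flows through one seeded `Generator`; a stray call to the global `np.random` state would silently break this.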

Gate check → proceed to Level 2?

Yes. Both L1 drills pass: the LLN running average converged to within one SE of 0.5, and the Bayesian MAP matched the frequentist MLE.

Next action (Level 2): pull real stock returns via yfinance, test normality (expect to fail — fat tails), fit a Student-t distribution via MLE, and run a Fama-French 3-factor regression. The first L2 drill is the one where “everything looks like signal until you remember noise looks like signal too.”