Consolidation Pass — from drills to reusable package
Context
After completing Levels 1-5 of the math curriculum (probability → statistics → linear algebra → calculus → stochastic calculus) and extracting techniques from Halls-Moore’s Successful Algorithmic Trading, we had:
- Seven one-off drill scripts that each rewrote the same yfinance / pct_change / dropna boilerplate
- No out-of-sample discipline — we were evaluating on the same window we calibrated on
- No shared performance reporting
- No formal bias audit on any strategy
- No event-driven backtester architecture, which will be required for the prediction-market track
The consolidation pass fixes all five of those before we start building actual strategies.
What got built
A new Python package, `autoinv`, installed in editable mode under 01-projects/automated-investing/autoinv/, along with a pytest suite and a full-pipeline demo script.
Package structure:
```
autoinv/
├── __init__.py          # version + module overview
├── README.md            # quickstart and design principles
├── data.py              # yfinance wrapper, returns, lagged features, Fama-French
├── stats.py             # normality tests, t-MLE, factor regression, permutation
├── validation.py        # TimeSeriesSplit helper, score_walk_forward, BiasAudit
├── metrics.py           # Sharpe, drawdown decomposition, Brier score, strategy_report
├── portfolio.py         # markowitz_with_cost, efficient_frontier, pca_factors
├── pricing.py           # black_scholes, greeks, monte_carlo_call, put_call_parity
└── engine.py            # event-driven backtester skeleton
tests/
├── test_stats.py        # 6 tests
├── test_validation.py   # 4 tests
├── test_metrics.py      # 7 tests
├── test_portfolio.py    # 5 tests
├── test_pricing.py      # 5 tests
└── test_engine.py       # 3 tests
scripts/
└── demo_buy_after_up_day_v2.py   # full pipeline in ~100 lines using the package
pyproject.toml           # hatch build, autoinv>=0.1.0 editable install
```
Test status: 30/30 passing.
Key design decisions
1. Time-series-aware cross-validation is the ONLY kind we expose
From the Halls-Moore gotcha (Ch 15, p. 275) proven out in level-2b-time-series-cv: shuffled k-fold leaks future data into training, which inflates measured accuracy and can flip a worthless model into one that “looks like” it has a small edge.
autoinv.validation.timeseries_cv() is the only CV constructor the package exposes. There is intentionally no shuffled-KFold helper. If someone using the package wants shuffled CV they have to reach around us into sklearn directly — making it a deliberate violation rather than an accident.
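A minimal sketch of what that single CV constructor could look like — a thin wrapper over sklearn's `TimeSeriesSplit`. The function name matches the text, but the signature and `gap` behavior here are assumptions, not the package's confirmed API:

```python
# Sketch of a time-series-only CV constructor; the real
# autoinv.validation.timeseries_cv may differ.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

def timeseries_cv(n_splits: int = 5, gap: int = 0) -> TimeSeriesSplit:
    """Expanding-window CV: every training fold strictly precedes its test fold.

    `gap` drops that many observations between train and test, guarding
    against leakage from lagged/overlapping features.
    """
    return TimeSeriesSplit(n_splits=n_splits, gap=gap)

# No shuffling: each test fold starts only after the training window ends,
# so future returns can never leak into the fit.
X = np.arange(20).reshape(-1, 1)
for train_idx, test_idx in timeseries_cv(n_splits=3).split(X):
    assert train_idx.max() < test_idx.min()
```

The point of exposing only this constructor is that the safe choice is the zero-keystroke choice; shuffled CV requires going around the package.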
2. Every performance number is reported with a baseline
The score_walk_forward helper returns a CVComparison dataclass that includes the majority-class baseline and the lift over it — so the reviewer can never accidentally celebrate a 55% classifier that’s worse than always-predict-up.
metrics.strategy_report produces a one-shot summary including max drawdown, duration, and time-to-recover. The demo compares against buy-and-hold explicitly.
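The baseline-always-attached idea can be shown with a toy version of the comparison object. Field names here are illustrative assumptions, not the package's actual `CVComparison` definition:

```python
# Hypothetical shape of the CVComparison result -- field names are
# assumptions, not the package's confirmed API.
from dataclasses import dataclass
import numpy as np

@dataclass
class CVComparison:
    model_accuracy: float
    baseline_accuracy: float   # majority-class: always predict the modal label

    @property
    def lift(self) -> float:
        return self.model_accuracy - self.baseline_accuracy

# A 55% classifier on a market that goes up 58% of days has *negative* lift:
y = np.array([1] * 58 + [0] * 42)          # 58% up-days
baseline = max(y.mean(), 1 - y.mean())     # majority-class baseline = 0.58
cmp = CVComparison(model_accuracy=0.55, baseline_accuracy=baseline)
assert cmp.lift < 0   # "looks like an edge" but loses to always-predict-up
```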
3. Bias audit as a first-class object
validation.BiasAudit is a dataclass with four flags (optimisation / look-ahead / survivorship / cognitive) and a passed property that requires all four to be False. It has a .report() method that prints a PASS/FAIL checklist. The demo pipeline runs this as the final gate.
This turns the Halls-Moore Ch 3 checklist from a thing to remember into a thing that’s forced on every strategy by the type system. If you don’t fill in the audit, the strategy doesn’t ship.
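A minimal sketch of the audit object, assuming the four flags described above; the real field names and `report()` output format in autoinv.validation may differ:

```python
# Minimal sketch of BiasAudit -- field/method shapes are assumptions.
from dataclasses import dataclass, fields

@dataclass
class BiasAudit:
    optimisation: bool    # tuned on the same data we scored on?
    look_ahead: bool      # any feature computed with future information?
    survivorship: bool    # universe filtered by who survived to today?
    cognitive: bool       # would I actually trade this (drawdowns, boredom)?

    @property
    def passed(self) -> bool:
        # All four flags must be False for the strategy to ship.
        return not any(getattr(self, f.name) for f in fields(self))

    def report(self) -> str:
        lines = [f"{'FAIL' if getattr(self, f.name) else 'PASS'}  {f.name}"
                 for f in fields(self)]
        lines.append("AUDIT " + ("PASSED" if self.passed else "FAILED"))
        return "\n".join(lines)

audit = BiasAudit(optimisation=False, look_ahead=False,
                  survivorship=False, cognitive=True)
assert not audit.passed          # one flagged bias fails the whole audit
```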
4. Event-driven backtester is a skeleton, not a platform
engine.py is Halls-Moore Ch 13’s architecture in ~200 lines: abstract Strategy / Portfolio / ExecutionHandler bases, a concrete HistoricalCSVDataHandler, SinglePositionPortfolio, and SimulatedExecutionHandler (with commission + slippage), plus a Backtest orchestrator.
Differences from Halls-Moore:
- Uses `collections.deque` instead of `queue.Queue` (single-threaded, so no need for a thread-safe primitive)
- Fewer classes, simpler event dispatch
- Explicitly labeled as a research skeleton with a NOT-PRODUCTION warning in the docstring
When we hit the PM track, the same Backtest orchestrator plugs into a Polymarket CLOB WebSocket DataHandler without changing the strategy or portfolio code. That’s the whole point of the event-driven decomposition.
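The single-threaded drain-the-queue pattern can be sketched with a toy event chain. Event kinds and handlers here are illustrative, not autoinv's actual classes:

```python
# Toy event loop showing why a plain deque suffices in a single-threaded
# backtest: each bar pushes a SIGNAL, the loop drains FIFO until quiet.
from collections import deque

events: deque = deque()

def on_market_bar(price: float) -> None:
    # Strategy reacts to the bar; a hypothetical threshold rule.
    events.append(("SIGNAL", "LONG" if price > 100 else "FLAT"))

def dispatch() -> list:
    handled = []
    while events:                       # drain until quiet, then next bar
        kind, payload = events.popleft()
        if kind == "SIGNAL" and payload == "LONG":
            events.append(("ORDER", 1))     # portfolio sizes the signal
        elif kind == "ORDER":
            events.append(("FILL", payload))  # execution simulates the fill
        handled.append(kind)
    return handled

on_market_bar(101.5)
assert dispatch() == ["SIGNAL", "ORDER", "FILL"]
```

Swapping the `on_market_bar` source from a CSV iterator to a WebSocket callback leaves the dispatch loop, and everything downstream of it, untouched.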
5. No shrinkage estimator or Black-Litterman prior yet
The L3 Markowitz drill showed that sample means are a terrible expected-return estimate (JNJ 55.6% “annualized expected”). The fix is Ledoit-Wolf shrinkage on the covariance and Black-Litterman posteriors for the means. We haven’t built those yet — they’re coming when the first real strategy needs them. For now markowitz_with_cost is exposed with a loud comment in the docstring about the estimation-error trap.
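As a preview of the planned covariance fix (not part of autoinv yet), sklearn ships a Ledoit-Wolf estimator that shrinks the noisy sample covariance toward a structured target — the fake data below is only for illustration:

```python
# Ledoit-Wolf shrinkage: tames the estimation error that makes raw
# sample-covariance Markowitz weights unstable. Not in autoinv yet.
import numpy as np
from sklearn.covariance import LedoitWolf

rng = np.random.default_rng(0)
returns = rng.standard_t(df=4, size=(250, 5)) * 0.01   # ~1yr, 5 fake assets

lw = LedoitWolf().fit(returns)
sample_cov = np.cov(returns, rowvar=False)

# shrinkage_ in [0, 1]: how far we moved away from the noisy sample estimate
assert 0.0 <= lw.shrinkage_ <= 1.0
assert lw.covariance_.shape == sample_cov.shape
```

Note this only addresses the covariance half; the expected-return half still needs Black-Litterman or an equivalent prior.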
The demo script — proof the abstraction works
../scripts/demo_buy_after_up_day_v2 rebuilds the L2 “buy after up day” permutation-test drill using only autoinv imports. It:
- Pulls data (`data.get_returns`, `data.get_prices`)
- Runs a normality audit (`stats.normality_report`, `stats.fit_student_t`)
- Runs the permutation test (`stats.permutation_test`)
- Runs the strategy through the event-driven backtester (`engine.Backtest`)
- Reports performance (`metrics.strategy_report`)
- Compares to buy-and-hold baseline
- Runs the four-bias audit (`validation.BiasAudit`)
The full pipeline fits in ~100 lines, almost all of it the one custom piece (the BuyAfterUpDayStrategy class). The data plumbing and metrics plumbing are gone.
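The permutation test at the heart of the demo can be illustrated standalone; the real `stats.permutation_test` signature is assumed, and the data below is synthetic:

```python
# Standalone sketch of the demo's permutation test: does the strategy's
# entry timing beat random timing of the same number of entries?
import numpy as np

def permutation_test(signal, returns, n_perm=2000, seed=0):
    """One-sided p-value: P(random timing does at least as well as ours)."""
    rng = np.random.default_rng(seed)
    observed = returns[signal].mean()
    perm_means = np.array([
        returns[rng.permutation(signal)].mean() for _ in range(n_perm)
    ])
    return float((perm_means >= observed).mean())

rng = np.random.default_rng(1)
rets = rng.normal(0.0005, 0.01, size=1000)   # i.i.d. fake daily returns
signal = rng.random(1000) < 0.5              # stand-in "buy after up day" mask
p = permutation_test(signal, rets)
assert 0.0 <= p <= 1.0   # with no real edge, p is typically not small
```

A large p (like the demo's 0.94) says the strategy's timing adds nothing over random entries.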
What the demo proved:
- The permutation test correctly fails the strategy (5.7th percentile of shuffled timings, p = 0.94)
- Event-driven backtest ran 1569 bars with 793 fills
- Strategy Sharpe 0.110 vs buy-and-hold Sharpe 0.717 — the backtest confirms the permutation test’s warning and quantifies just how bad it is after costs
- Max drawdown: strategy -25.11% vs buy-and-hold -33.72% (the one silver lining — flat periods dampen drawdowns)
- Annualized return: strategy +1.48% vs buy-and-hold +15.85%
- Bias audit correctly flagged cognitive bias (a strategy that underperforms buy-and-hold this badly is not one I’d stomach)
This is the template for every future strategy. New ideas get a script that looks structurally identical to this demo but plugs in different strategy logic.
What the consolidation pass did NOT do (deferred)
- Shrinkage covariance and Black-Litterman priors — deferred until the first real strategy needs them
- Walk-forward optimization (rolling-window re-calibration) — wait for the PM track
- Full pairs-trading with CADF / Hurst — Halls-Moore Ch 9 techniques, add when we need them
- Kelly criterion and position sizing — Halls-Moore Ch 12, add when the first profitable strategy is ready to size
- Alpha Vantage / Polygon data sources — stick with yfinance until it’s the bottleneck
- MySQL securities master — overkill at our scale
What this unlocks
The package is now the foundation for:
- Starting the Prediction Markets track (the PM levels) — the event-driven engine is ready to accept a Polymarket CLOB WebSocket `DataHandler`
- Rebuilding the L3 Markowitz and L4 portfolio drills using the package API (optional cleanup)
- New strategy experiments — template script is ~100 lines, no more copy-paste boilerplate
- Running strategies through proper walk-forward evaluation rather than in-sample-only scoring
Next actions
- Start PM1 — build a Polymarket CLOB WebSocket `DataHandler` that plugs into the existing `engine.Backtest` orchestrator. The Monitor tool is the right primitive here (see ../../../06-reference/2026-04-10-claude-code-monitor-tool).
- Build `metrics.plot_equity_curve` and `plot_drawdown_chart` helpers so the demo script produces the same plots the one-off drills did
- Write a `level-6-consolidation.md` concept doc if we want to cite this pass as a curriculum milestone
- Consider a `strategies/` subfolder for concrete strategy classes (Moving Average Crossover, Pairs, etc.) once we have more than 2-3 of them
Related
- Infrastructure plan — this consolidation was explicitly proposed there
- ../../../06-reference/2026-04-10-halls-moore-algo-trading — the source of most architectural decisions
- ../../../06-reference/2026-04-10-gemchange-quant-from-scratch — the math foundation
- ../../../06-reference/2026-04-10-gemchange-simulate-like-quant-desk — the PM track blueprint this package will support