06-reference

halls moore algo trading

Thu Apr 09 2026 20:00:00 GMT-0400 (Eastern Daylight Time) · reference · source: Successful Algorithmic Trading (QuantStart course) · by Michael L. Halls-Moore

Successful Algorithmic Trading — Halls-Moore (QuantStart)

Purchased course material, 284 pages, 15 chapters across 6 parts, plus a full Python codebase with 10 chapter folders. This is Ray Data Co’s reference note extracting techniques and architecture decisions from the book and mapping them against our existing Automated Investing work. Summaries are in my own words; direct quotes are ≤15 words and in quotation marks. The book is for personal learning — not republished or commercialized.

Why it’s in the vault

Two gaps in our current Automated Investing work that I flagged at the L4 gate:

  1. No out-of-sample discipline — every drill so far calibrates and evaluates on the same window
  2. No event-driven backtester architecture — we’d have to reinvent one for PM3+ anyway

This book addresses both directly. It also has chapters on time series analysis (ADF / Hurst / cointegration → pairs trading), performance measurement (drawdown decomposition), and risk / money management (Kelly criterion nuances) that will feed later levels.

Cross-references:

Book structure (mapped to our progress)

| Part | Ch | Topic | Pages | Value for us |
| --- | --- | --- | --- | --- |
| I — Introducing Algorithmic Trading | 1 | Intro | 3-10 | Skip |
| | 2 | What Is Algorithmic Trading? | 11-18 | Skim — retail trader framing |
| II — Trading Systems | 3 | Successful Backtesting | 21-29 | ★ Critical — backtesting biases |
| | 4 | Automated Execution | 31-41 | Useful for PM5, skip for now |
| | 5 | Sourcing Strategy Ideas | 43-55 | Skim |
| III — Data Platform Development | 6 | Financial Data Storage | 59-79 | Useful when we outgrow yfinance → MySQL securities master |
| | 7 | Processing Financial Data | 81-102 | Continuous futures handling |
| IV — Modelling | 8 | Statistical Learning | 105-112 | Review of ML fundamentals |
| | 9 | Time Series Analysis | 113-130 | ★ ADF, Hurst, Cointegration — pairs trading toolkit |
| | 10 | Forecasting | 131-148 | Classifier comparison |
| V — Performance and Risk | 11 | Performance Measurement | 151-166 | ★ Drawdown decomposition, risk/reward beyond Sharpe |
| | 12 | Risk and Money Management | 167-178 | ★ Kelly criterion nuances, VaR |
| VI — Automated Trading | 13 | Event-Driven Engine | 181-232 | ★ Full backtester architecture |
| | 14 | Strategy Implementations | 233-262 | MAC, S&P forecast, pairs trade |
| | 15 | Strategy Optimisation | 263-284 | ★ Cross-validation, grid search, overfitting |

The five starred chapters are the ones I’ve deep-read today. Others are cited briefly or scheduled for later levels.

Chapter 3 — Successful Backtesting

Halls-Moore identifies four biases that inflate backtest performance, each of which applies directly to our existing drills.

3.1 Optimisation bias (“curve fitting” / “data snooping”)

The book calls this “probably the most insidious of all backtest biases” (p. 23). It’s the process of tuning parameters until the backtest looks great; the strategy then fails in live trading because you’ve fit noise.

Mitigations Halls-Moore recommends:

Application to our L3 Markowitz work: our 1.70 Sharpe “max-Sharpe portfolio” is a textbook optimisation-bias result. We used sample means computed on the full 2022-2026 window as the expected-return estimates — that’s calibrating parameters on the whole dataset, exactly what the book warns against. The JNJ 55.6% annualized “expected return” is noise we fit, not signal.

3.2 Look-ahead bias

Accidentally using future data in a point-in-time simulation. Three common sources Halls-Moore highlights:

Application: our Fama-French L2 regression is technically look-ahead-biased if we were to call the coefficients “the factor loadings” and trade on them. Any regression-calibrated strategy has to use expanding-window or rolling-window coefficients, not full-sample coefficients.
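A minimal sketch of the expanding-window fix, using synthetic stand-ins for the factor and return series (the data, the 3-factor shape, and the 36-month minimum window are assumptions for illustration, not our drill's actual inputs):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-ins for monthly excess returns and factor returns
# (hypothetical data; our drill pulls these from the Fama-French files).
n = 120
factors = rng.normal(0, 0.04, size=(n, 3))  # e.g. Mkt-RF, SMB, HML
returns = factors @ np.array([1.0, 0.3, -0.2]) + rng.normal(0, 0.02, n)

min_window = 36
betas = np.full((n, 3), np.nan)
for t in range(min_window, n):
    # Fit only on data strictly before t: no look-ahead.
    X, y = factors[:t], returns[:t]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    betas[t] = coef  # loadings usable for a decision made at time t
```

Swapping `factors[:t]` for `factors[t - min_window:t]` gives the rolling-window variant; either way, the full-sample coefficients never touch a trading decision.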

3.3 Survivorship bias

Only testing on assets that still exist today, which inflates returns because bankrupt / delisted stocks are silently excluded.

Mitigations:

Application: our L3 PCA used a hard-coded list of 100 current large-caps. Every one of them is a survivor — companies that were large-caps in 2020 and blew up (e.g., Bed Bath & Beyond style stories) are silently excluded. This is a known limitation of our current data layer.

3.4 Cognitive bias

The psychological failure to endure drawdowns in live trading that looked “acceptable” in a backtest. If a strategy shows a 25% drawdown over 4 months in backtest, you should expect the same live, and you need to be able to stomach it without bailing at the bottom.

Halls-Moore’s framing: “even though the strategy is algorithmic in nature, psychological factors can still have a heavy influence on profitability” (p. 25). This is why the discipline of the prior three biases matters operationally — if you trust the backtest, you can hold through the drawdown; if you secretly don’t, you’ll bail at the worst possible time.

3.5 Transaction costs

Three categories: commissions/fees, slippage, and market impact.

Our L4 Markowitz drill models the first via an L1 turnover penalty but doesn’t model slippage or market impact. At our projected trading size this is probably fine.

Chapter 15 — Strategy Optimisation

Overfitting and the bias-variance dilemma

Halls-Moore frames strategy optimisation explicitly as a bias-variance problem:

Overfitting is when you drive training error to zero by increasing model flexibility but destroy out-of-sample performance. This is Level 2’s “first 10 strategies are noise” warning in a different vocabulary.

Model selection via train/test split

Simplest form: hold out a test set (20-50%), train on the rest, evaluate only on the test set. Halls-Moore uses sklearn’s train_test_split with random_state=42 and test_size=0.8 (meaning 20% train, 80% test).
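A minimal sketch of that call with toy arrays in place of real features (the arrays are placeholders; the split parameters are the book's):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(-1, 1)           # hypothetical feature matrix
y = (np.arange(100) % 2).astype(float)      # hypothetical labels

# The book's (shuffled) split: test_size=0.8 -> 20 train rows, 80 test rows.
# Fine for cross-sectional data; for ordered data see the gotcha below.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.8, random_state=42)
```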

k-fold cross-validation

Partition data into k equal chunks, train on k-1, test on the held-out chunk, repeat k times, average the performance. Standard practice uses k=10. sklearn’s KFold makes it a one-liner.

CRITICAL GOTCHA — the book explicitly flags this

On p. 275, after demonstrating k-fold with shuffle=True on SPY time-series data, Halls-Moore writes (italicized in the original):

“Note that technically it is not appropriate to use simple cross-validation techniques on temporally ordered data (i.e. time-series). There are more sophisticated mechanisms for coping with autocorrelation in this fashion, but here we wanted to highlight the approach so we have used time series data for simplicity.”

This is the single most important thing in the chapter for us. Shuffled k-fold on time series creates look-ahead bias: a fold’s training set can contain timestamps that come after the test fold’s timestamps. The model gets to “peek into the future” through the shuffled ordering. Test accuracy is inflated as a result.

The correct tool is TimeSeriesSplit (sklearn) or walk-forward validation, where each train set only contains timestamps strictly earlier than the test set. We should bake this into our autoinv package from day one — never using shuffled k-fold on financial time series.
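The leakage is easy to demonstrate directly on the fold indices themselves, without fitting any model (a sketch; the 100-bar index is a stand-in for any ordered price series):

```python
import numpy as np
from sklearn.model_selection import KFold, TimeSeriesSplit

idx = np.arange(100)  # stand-in for 100 chronologically ordered bars

# Shuffled k-fold: training folds routinely contain indices LATER
# than the earliest test index -> look-ahead leakage on time series.
kf = KFold(n_splits=5, shuffle=True, random_state=42)
leaky = any(train.max() > test.min() for train, test in kf.split(idx))

# TimeSeriesSplit: every training index is strictly earlier than
# every test index, by construction.
tss = TimeSeriesSplit(n_splits=5)
clean = all(train.max() < test.min() for train, test in tss.split(idx))

print(leaky, clean)  # -> True True
```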

I’ll prove this out by running both approaches on our existing L2 “buy after up day” strategy and showing how the shuffled approach produces a more optimistic result than the walk-forward approach.

Grid search + cross-validation

sklearn.GridSearchCV takes a parameter grid and a cv scheme and searches the cartesian product. Halls-Moore example: RBF kernel SVM with C ∈ {1, 10, 100, 1000} and gamma ∈ {1e-3, 1e-4} = 8 parameter combinations × 10 folds = 80 model fits.
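The same grid combined with a time-series-aware cv object looks like this (synthetic data, and 5 folds rather than the book's 10, so 8 × 5 = 40 fits; everything except the C/gamma grid is an illustrative assumption):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                        # hypothetical features
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)

param_grid = {"C": [1, 10, 100, 1000], "gamma": [1e-3, 1e-4]}
# 8 parameter combinations x 5 chronological folds = 40 model fits.
search = GridSearchCV(SVC(kernel="rbf"), param_grid,
                      cv=TimeSeriesSplit(n_splits=5))
search.fit(X, y)
print(search.best_params_)
```

Passing the splitter as `cv` is the whole trick: the grid search machinery is unchanged, only the fold generation respects time order.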

Computational cost warning: 5 parameters × 10 values each = 10^5 = 100,000 combinations (before multiplying by the number of CV folds). Parallelism helps but isn’t a substitute for keeping the parameter count low.

Strategy-level parameter grids

Section 15.3 demonstrates a pairs-trading strategy with three parameters (OLS lookback window, z-score entry, z-score exit), three values each, gridded as 3^3 = 27 combinations and evaluated via the event-driven backtester. Uses itertools.product for the cartesian product.
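The grid construction is a one-liner (parameter names follow Section 15.3's three parameters; the specific values here are illustrative, not the book's):

```python
from itertools import product

# Hypothetical values for the three pairs-trade parameters.
lookbacks = [50, 100, 200]   # OLS lookback window
z_entries = [1.5, 2.0, 2.5]  # z-score entry threshold
z_exits = [0.5, 1.0, 1.5]    # z-score exit threshold

grid = list(product(lookbacks, z_entries, z_exits))  # 3^3 = 27 tuples

for lookback, z_entry, z_exit in grid:
    pass  # run one event-driven backtest per combination here
```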

This is the pattern we’ll use once we’re running strategy-level backtests in the autoinv package.

Chapter 13 — Event-Driven Trading Engine (architecture sketch)

Halls-Moore argues for an event-driven architecture (rather than a vectorized one) because it more faithfully models the real trading environment and catches bugs that vectorized backtests hide.

The component decomposition (from the chapter TOC and code inspection):

| Component | Responsibility |
| --- | --- |
| Event | Base class. Subclasses: MarketEvent, SignalEvent, OrderEvent, FillEvent |
| DataHandler | Streams market data (from CSV, API, or DB) onto an event queue, emitting MarketEvents |
| Strategy | Consumes MarketEvents, applies signal logic, emits SignalEvents |
| Portfolio | Consumes SignalEvents, applies position sizing and risk rules, emits OrderEvents |
| ExecutionHandler | Consumes OrderEvents, simulates (or, in live mode, submits) execution, emits FillEvents |
| Backtest | The orchestrator: holds an event queue and pumps events through the components in a loop until the data stream is exhausted |

Why this matters: each component is independently testable and swappable. To switch from backtest to live trading, you only swap ExecutionHandler (e.g., Halls-Moore provides a SimulatedExecutionHandler and an IBExecutionHandler for Interactive Brokers). To switch from equities to Polymarket, you only swap DataHandler. Same strategy and portfolio code runs against both.

This is almost exactly what we need for the PM track. The Polymarket CLOB WebSocket subscriber becomes a custom DataHandler; the binary-contract Monte Carlo probability engine becomes a Strategy; the Kelly-sized position builder becomes a Portfolio; the CLOB order router becomes an ExecutionHandler. We’ll adopt this decomposition in the autoinv consolidation pass.

Note: Halls-Moore’s code uses Python’s queue.Queue as the event bus. For our use case (single-threaded, no concurrency) a simple collections.deque is lighter and faster. Small stylistic tweak for our rewrite.
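A minimal sketch of the pump loop with a deque as the bus (class and method names here are ours, not the book's, and Portfolio/ExecutionHandler are collapsed into a counter to keep the skeleton visible):

```python
from collections import deque

class Event: pass
class MarketEvent(Event): pass
class SignalEvent(Event): pass

events = deque()  # single-threaded event bus: deque, not queue.Queue

class ToyStrategy:
    def on_market(self, event, queue):
        queue.append(SignalEvent())  # emit a signal on every bar

def run(bars, strategy):
    fills = 0
    for _ in bars:                    # DataHandler role: stream one bar
        events.append(MarketEvent())
        while events:                 # drain the queue before the next bar
            ev = events.popleft()
            if isinstance(ev, MarketEvent):
                strategy.on_market(ev, events)
            elif isinstance(ev, SignalEvent):
                fills += 1            # Portfolio/Execution stand-in
    return fills

print(run(range(5), ToyStrategy()))  # -> 5
```

The inner drain loop is the key structural idea: every event a component emits is processed before the next bar arrives, exactly as in the book's Queue-based version, just without the thread-safety overhead.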

Techniques we should adopt (and prove out)

Concrete items to lift into our project. These are my own decisions about what to implement, not direct copies of the book’s code:

  1. Time-series-aware cross-validation (TimeSeriesSplit / walk-forward) — address the book’s explicit gotcha. Drill built today: ../01-projects/automated-investing/experiments/level-2b-time-series-cv
  2. Sensitivity analysis plots for any parameter sweep. Smooth surface = robust, jagged = overfit. Add to the consolidation package as a utility.
  3. Four-bias audit checklist for every new strategy. Optimisation bias, look-ahead bias, survivorship bias, cognitive bias — each gets a yes/no answer before the strategy is allowed to run on real money.
  4. Drawdown decomposition from Ch 11 — not just max drawdown, but drawdown duration, recovery time, and visualizing the drawdown path. Add as part of the performance reporting utility.
  5. Event-driven component decomposition for the backtester — DataHandler / Strategy / Portfolio / ExecutionHandler as independently swappable components. Adopt for the consolidation pass and the PM track.
  6. Grid search with CV pattern from Ch 15 — use sklearn’s GridSearchCV with our own TimeSeriesSplit cv object (not the default shuffled KFold). This gives us hyperparameter optimization with honest out-of-sample scoring.
  7. itertools.product for strategy-level parameter grids (lightweight, no sklearn dependency on the strategy side).
  8. SimulatedExecutionHandler with slippage and commission modeling — start simple (fixed bps), upgrade to volume-weighted when we hit PM4.
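Item 4's drawdown decomposition can be sketched in a few lines (a minimal version for the performance utility; the function name and the bars-since-high duration definition are our choices, not the book's):

```python
import numpy as np

def drawdown_stats(equity):
    """Drawdown path, max drawdown, and longest drawdown duration
    (in bars since the last equity high) for an equity curve."""
    equity = np.asarray(equity, dtype=float)
    peak = np.maximum.accumulate(equity)   # running high-water mark
    dd = equity / peak - 1.0               # drawdown path (<= 0)
    duration = np.zeros(len(equity), dtype=int)
    for i in range(1, len(equity)):
        duration[i] = 0 if equity[i] >= peak[i - 1] else duration[i - 1] + 1
    return dd, dd.min(), duration.max()

curve = [100, 110, 104, 99, 108, 112, 105]
dd, max_dd, max_dur = drawdown_stats(curve)
print(round(max_dd, 3), max_dur)  # -> -0.1 3
```

Recovery time falls out of the same duration array: it is the run length from the drawdown trough back to the next zero.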

Techniques that don’t add value for us (overlap or not needed)

Open questions