How I’d Become a Quant If I Had to Start Over Tomorrow — @gemchange_ltd
Why this is in the vault
This is the roadmap article for the Automated Investing small bet. It lays out an 18-month, five-level curriculum for going from zero to employable quant. The value for RDCO isn’t the career path itself — it’s that the same math is what any credible automated-investing system has to sit on top of. We can’t automate what we don’t understand, and this article is the clearest version of the dependency graph I’ve seen.
TL;DR
Quant trading is a math game, not a stock-picking game. Edge comes from statistical relationships and structural inefficiencies, not opinions. The author lays out a five-layer curriculum where each layer is a hard prerequisite for the next — skip a level and the ceiling on every level above it collapses.
Entry-level comp at top firms is $300K–$500K and AI/ML quant hiring grew 88% YoY in 2025. The math is the moat: AI can write the code, but it can’t substitute for knowing why the model is correct.
The five-level dependency graph
Each level has 3–8 weeks of recommended study time plus homework. The author’s stated total: ~18 months at 2 hours/day.
Level 1 — Probability (3–4 weeks)
The core question: what are the odds, and are they in my favor?
- Conditional thinking — raw probabilities are noisy; conditional probabilities are signal
- Bayes’ theorem — how you update conviction as new data arrives (in practice, often approximated with Monte Carlo sampling)
- Expected value and variance — EV is conviction and variance is risk; survive the variance and positive EV wins over time
- Textbook: Blitzstein & Hwang, Introduction to Probability (free PDF, Harvard) — Ch 1–6
- Code drills: simulate 10K coin flips for law of large numbers; implement a Bayesian updater
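The two code drills above can be sketched with the standard library alone. This is a minimal illustration, not the author's solution: the function names, the seed, the two candidate coin biases (0.5 and 0.7), and the flip sequence are all assumptions made up for the example.

```python
import random


def running_heads_fraction(n_flips, p_heads=0.5, seed=0):
    """Simulate n_flips coin flips and return the running fraction of heads."""
    rng = random.Random(seed)
    heads = 0
    fractions = []
    for i in range(1, n_flips + 1):
        heads += rng.random() < p_heads  # True counts as 1
        fractions.append(heads / i)
    return fractions


def bayes_update(prior, likelihoods):
    """Posterior over hypotheses after one observation (Bayes' theorem)."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Law of large numbers: the running fraction settles near 0.5
fractions = running_heads_fraction(10_000)
print(fractions[9], fractions[99], fractions[-1])

# Bayesian updater: is the coin fair (P(H)=0.5) or biased (P(H)=0.7)?
biases = [0.5, 0.7]        # hypothetical candidate biases
posterior = [0.5, 0.5]     # start with equal conviction
for flip in "HHTHHHTH":    # made-up observations
    likelihoods = [b if flip == "H" else 1 - b for b in biases]
    posterior = bayes_update(posterior, likelihoods)
print(posterior)
```

The early entries of `fractions` swing widely while the last one sits close to 0.5, which is the "survive the variance" point in miniature. The posterior shifts toward the biased hypothesis because six heads in eight flips is more likely under P(H)=0.7.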
Level 2 — Statistics (4–5 weeks)
Statistics is the BS detector. Per the author, “most of what looks like NOT A BS is actually NOISY BS” — your first 10 strategies will all be noise, and beginners massively overestimate how much real signal they’ve found.
- Hypothesis testing — build a null, compute test statistic, watch out for the multiple comparisons problem (Bonferroni / Benjamini-Hochberg)
- Linear regression — decomposing returns against risk factors; the intercept α is your real edge, and if α vanishes after controlling for factors, your “strategy” is disguised market beta
- Use Newey-West standard errors — financial data has autocorrelation and heteroskedasticity, so the default OLS standard errors are unreliable
- Maximum Likelihood Estimation — how every model in finance gets calibrated; when a desk says “calibrating” they mean MLE
- Textbook: Wasserman, All of Statistics — Ch 1–13
- Code drills: fit Student-t to real returns via MLE, Fama-French 3-factor regression, permutation test
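The permutation-test drill can be sketched as follows. It asks whether returns on "signal" days are higher than on other days by shuffling the labels and counting how often chance alone produces a gap at least as large. Everything here is an assumption for illustration: the function name, the synthetic Gaussian returns, the sample sizes, and the seed. No real market data is involved.

```python
import random
import statistics


def permutation_test(signal_returns, other_returns, n_permutations=2000, seed=0):
    """One-sided p-value for mean(signal) - mean(other) > 0 under label shuffling."""
    rng = random.Random(seed)
    observed = statistics.fmean(signal_returns) - statistics.fmean(other_returns)
    pooled = list(signal_returns) + list(other_returns)
    n_signal = len(signal_returns)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # break any link between label and return
        diff = statistics.fmean(pooled[:n_signal]) - statistics.fmean(pooled[n_signal:])
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Synthetic daily returns (made up, not market data)
rng = random.Random(42)
noise_signal = [rng.gauss(0.0, 0.01) for _ in range(60)]   # "strategy" with no edge
noise_other = [rng.gauss(0.0, 0.01) for _ in range(190)]
real_signal = [rng.gauss(0.01, 0.01) for _ in range(60)]   # strategy with a large edge

p_noise = permutation_test(noise_signal, noise_other)
p_real = permutation_test(real_signal, noise_other)
print(p_noise, p_real)
```

If many candidate strategies are tested this way, the resulting p-values still need a multiple-comparisons correction such as Bonferroni or Benjamini-Hochberg before any of them is trusted.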
Level 3 — Linear Algebra (4–6 weeks)
The machinery behind portfolio construction, PCA, neural nets, and factor models.
- Covariance matrices — portfolio variance is a single quadratic form
- Eigenvalues — on a 500-stock universe, the first 5 eigenvectors explain ~70% of variance; the rest is noise. This is the foundation of factor investing and dimensionality reduction.
- Textbook / video: Strang, Introduction to Linear Algebra + MIT 18.06 lectures (the author calls watching all of 18.06 “non-negotiable”)
- Code drills: PCA decomposition of S&P 500, Markowitz mean-variance optimization from scratch with cvxpy
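As a small illustration of the two ideas above, the sketch below computes portfolio variance as the quadratic form w'Σw and estimates the share of total variance captured by the first principal component using power iteration. The 3-asset covariance matrix and weights are made-up numbers, and plain Python lists are used only to keep the example dependency-free; in practice numpy (`w @ cov @ w`, `numpy.linalg.eigh`) would be the natural choice.

```python
def mat_vec(matrix, vector):
    """Matrix-vector product for plain nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def portfolio_variance(weights, cov):
    """Portfolio variance as the quadratic form w' Sigma w."""
    return dot(weights, mat_vec(cov, weights))


def top_eigenvalue(cov, n_iter=500):
    """Largest eigenvalue of a symmetric positive-definite matrix via power iteration."""
    v = [1.0] * len(cov)
    for _ in range(n_iter):
        w = mat_vec(cov, v)
        norm = dot(w, w) ** 0.5
        v = [x / norm for x in w]
    return dot(v, mat_vec(cov, v))  # Rayleigh quotient of the unit vector v


# Toy 3-asset covariance matrix (illustrative numbers only)
cov = [
    [0.040, 0.018, 0.012],
    [0.018, 0.090, 0.021],
    [0.012, 0.021, 0.0225],
]
weights = [0.5, 0.3, 0.2]

var = portfolio_variance(weights, cov)
lam = top_eigenvalue(cov)
trace = sum(cov[i][i] for i in range(len(cov)))
explained = lam / trace  # share of total variance on the first eigenvector
print(var, lam, explained)
```

The eigenvalues of a covariance matrix sum to its trace, so `lam / trace` is the explained-variance ratio that PCA reports for the first component.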
Level 4 — Calculus & Optimization (4–5 weeks)
The language of change. Every Greek calculation and every neural net backprop is calculus.
- Taylor expansion — delta hedging is the first-order term, gamma hedging is second-order
- Why Itô calculus differs from ordinary calculus: the second-order Taylor term does not vanish for random processes (foreshadowing Level 5)
- Textbook: Boyd & Vandenberghe, Convex Optimization (free PDF, Stanford) — Ch 1–5
- Code drills: implement gradient descent from scratch, portfolio optimization with transaction costs
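A minimal sketch of the gradient-descent drill, applied to an unconstrained mean-variance objective f(w) = (γ/2)·w'Σw − μ'w, whose gradient is γΣw − μ. This is a simplification chosen for illustration: it has no budget constraint and no transaction costs, and the expected returns, covariance matrix, risk aversion, and step size are all assumed values.

```python
def mat_vec(matrix, vector):
    """Matrix-vector product for plain nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gradient(w, mu, cov, risk_aversion):
    """Gradient of f(w) = (risk_aversion / 2) * w'Sigma w - mu'w."""
    sigma_w = mat_vec(cov, w)
    return [risk_aversion * s - m for s, m in zip(sigma_w, mu)]


def gradient_descent(mu, cov, risk_aversion=3.0, step=1.0, n_steps=5000):
    """Plain fixed-step gradient descent starting from zero weights."""
    w = [0.0] * len(mu)
    for _ in range(n_steps):
        g = gradient(w, mu, cov, risk_aversion)
        w = [wi - step * gi for wi, gi in zip(w, g)]
    return w


# Illustrative inputs (not estimates from real data)
mu = [0.06, 0.10, 0.04]
cov = [
    [0.040, 0.018, 0.012],
    [0.018, 0.090, 0.021],
    [0.012, 0.021, 0.0225],
]

w_opt = gradient_descent(mu, cov)
residual = gradient(w_opt, mu, cov, 3.0)
print(w_opt, max(abs(g) for g in residual))
```

The fixed step converges here because it is below 2 / (γ · λ_max), and λ_max cannot exceed the trace of Σ (0.1525). At the optimum the gradient vanishes, which is the first-order condition γΣw = μ.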
Level 5 — Stochastic Calculus (6–8 weeks, the hardest)
Per the author: “Before stochastic calculus, you’re a data scientist who likes finance. After it, you’re a quant.”
- Brownian motion — continuous-time random walk; the key fact is that (dW_t)^2 = dt, which is what makes Itô’s lemma have its extra term
- Geometric Brownian Motion — the default model for stock prices
- Itô’s lemma — the second-order correction that ordinary calculus drops but stochastic calculus can’t
- Black-Scholes derivation — apply Itô to an option price, construct a delta-hedged portfolio, cancel the dW_t terms, and the drift μ vanishes entirely. This is the mind-bending moment: option prices do not depend on the stock’s expected return, so you can price as if everyone is risk-neutral
- The Greeks — Δ (hedge ratio), Γ (convexity / re-hedge frequency), Θ (time decay cost), ν (vega, where vol desks make money), ρ (rate sensitivity)
- Textbook: Shreve, Stochastic Calculus for Finance II (gold standard). Alternative: Arguin, A First Course in Stochastic Calculus (newer, more accessible)
- Code drills: Black-Scholes from scratch, verify via Monte Carlo, compute all Greeks
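The first two drills can be sketched in a few lines: a closed-form Black-Scholes price and delta for a European call on a non-dividend-paying stock, checked against a Monte Carlo estimate. The inputs (spot 100, strike 100, rate 5%, vol 20%, one year), the path count, and the seed are assumptions for illustration. Note that the simulation drifts at the risk-free rate rather than μ, which is the risk-neutral pricing point made above.

```python
import math
import random


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, rate, vol, maturity):
    """Black-Scholes price and delta of a European call (no dividends)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol**2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    price = spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)
    return price, norm_cdf(d1)


def mc_call(spot, strike, rate, vol, maturity, n_paths=200_000, seed=0):
    """Monte Carlo call price: simulate terminal GBM values under the risk-neutral measure."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol**2) * maturity   # Ito correction: -vol^2 / 2
    diffusion = vol * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        total += max(terminal - strike, 0.0)
    return math.exp(-rate * maturity) * total / n_paths


price, delta = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
mc_price = mc_call(100.0, 100.0, 0.05, 0.2, 1.0)
print(price, delta, mc_price)
```

The remaining Greeks can be obtained either from their closed forms or by bumping one input at a time and differencing `bs_call`, which is a useful cross-check on the analytic formulas.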
Prediction markets bonus section
The author highlights Polymarket + LMSR (Logarithmic Market Scoring Rule, Robin Hanson) as the most interesting math playground right now: the price function is literally the softmax used in every neural network classifier, prices always sum to 1, the market maker always quotes a price (so liquidity is always available, with depth set by the parameter b), and the market maker’s maximum loss is bounded at b * ln(n) for n outcomes. This connects probability, information theory, convex optimization, and integer programming in one place.
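A minimal LMSR sketch makes these properties concrete. The cost function is C(q) = b · ln(Σ exp(q_i / b)), prices are its gradient (a softmax of q / b), and a trade costs the difference in C before and after. The function names, the liquidity parameter b = 100, and the three-outcome market are assumptions for illustration; this is not a description of any exchange's production implementation.

```python
import math


def lmsr_cost(quantities, b):
    """C(q) = b * ln(sum_i exp(q_i / b)), using log-sum-exp for numerical stability."""
    m = max(q / b for q in quantities)
    return b * (m + math.log(sum(math.exp(q / b - m) for q in quantities)))


def lmsr_prices(quantities, b):
    """Instantaneous prices: softmax of q / b, so they always sum to 1."""
    m = max(q / b for q in quantities)
    exps = [math.exp(q / b - m) for q in quantities]
    total = sum(exps)
    return [e / total for e in exps]


def trade_cost(quantities, outcome, shares, b):
    """Amount a trader pays to buy `shares` of `outcome`."""
    after = list(quantities)
    after[outcome] += shares
    return lmsr_cost(after, b) - lmsr_cost(quantities, b)


b = 100.0                 # liquidity parameter (hypothetical value)
q = [0.0, 0.0, 0.0]       # outstanding shares in a fresh 3-outcome market
prices_before = lmsr_prices(q, b)
cost = trade_cost(q, 0, 50.0, b)
q[0] += 50.0
prices_after = lmsr_prices(q, b)

# Worst case for the market maker: a trader buys a huge position in the winner
big = 10_000.0
worst_loss = big - trade_cost([0.0, 0.0, 0.0], 0, big, b)
max_loss = b * math.log(3)
print(prices_before, prices_after, cost, worst_loss, max_loss)
```

Buying outcome 0 pushes its price above 1/3 and the others below it, while the worst-case loss approaches but never exceeds b · ln(n). A larger b gives deeper liquidity at the price of a larger possible loss for the market maker.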
Quant career archetypes and comp (2025 numbers, top tier)
| Archetype | Role | Skills / notes |
|---|---|---|
| Quant Researcher | Finds patterns in petabytes, builds predictive models | PhD-level math/stats/ML; at Jane Street, tens of thousands of GPUs |
| Quant Dev/Engineer | Trading platforms, execution engines, real-time pipelines | Production C++/Rust/Python, low-latency systems |
| Quant Trader | Runs capital, manages risk, real-time decisions | Highest variance — 8 figures in exceptional years |
| Risk Quant | Model validation, VaR, stress testing, compliance | Steadier career, lower ceiling |
| AI/ML Quant (emerging) | Signal generation with deep learning | Fastest-growing role, hiring +88% YoY in 2025 |
Comp bands at top tier (Jane Street / Citadel / HRT):
- New grad: $300K–$500K+
- Mid career (3–7 yrs): $550K–$950K
- Senior (8+ yrs): $1M–$3M+
- Star trader/PM: $3M–$30M+
- Jane Street’s average employee comp was reported at $1.4M/yr in H1 2025
Mid tier (Two Sigma / D. E. Shaw): new grad $250K–$350K, mid career $350K–$625K, senior $575K–$1.2M.
Interview gauntlet
Resume screen → online assessment (Zetamac mental math, target 50+) → phone screen (probability + betting games) → superday (3–5 back-to-back: mock trading, coding, whiteboard derivations). Jane Street gives problems intentionally too hard to solve alone — they test how you use hints and collaborate. Two-thirds of their recent intern class studied CS; a third studied math. Finance knowledge not required.
Prep resources:
- Xinfeng Zhou’s Green Book — A Practical Guide to Quantitative Finance Interviews (200+ real problems)
- QuantGuide.io (“LeetCode for quants”)
- Brainstellar
- Jane Street’s Figgie card game
Complete tool stack
Python
- Data: pandas, polars (10–50x faster on large datasets)
- Numerics: numpy, scipy
- Tabular ML: xgboost, lightgbm, catboost
- Deep ML: pytorch
- Optimization: cvxpy
- Derivatives: QuantLib (industry-grade C++ backend)
- Stats: statsmodels
- Backtesting: NautilusTrader (industrial), backtrader / vectorbt (starter)
- Quant research: Microsoft Qlib (17K+ stars, AI-oriented)
- RL for trading: FinRL (10K+ stars)
C++ / Rust
- QuantLib, Eigen, Boost (C++)
- RustQuant, NautilusTrader (Rust core + Python API)
Data sources
- Free: yfinance, Finnhub (60 calls/min), Alpha Vantage
- Mid-range: Polygon.io ($199/mo, sub-20ms latency), Tiingo
- Enterprise: Bloomberg Terminal (~$32K/yr), Refinitiv, FactSet
- Blockchain: Alchemy (free tier with archive access)
Solvers
- Gurobi — fastest commercial MIP solver, free academic license
- Google OR-Tools — strongest free solver
- PuLP / Pyomo — Python modeling interfaces
Full reading list (ordered)
Mathematics
- Blitzstein & Hwang — Introduction to Probability (free, Harvard)
- Strang — Introduction to Linear Algebra + MIT 18.06 lectures
- Wasserman — All of Statistics
- Boyd & Vandenberghe — Convex Optimization (free, Stanford)
- Shreve — Stochastic Calculus for Finance I & II
Quant finance
- Hull — Options, Futures, and Other Derivatives
- Natenberg — Option Volatility and Pricing
- López de Prado — Advances in Financial Machine Learning
- Ernest Chan — Quantitative Trading
- Zuckerman — The Man Who Solved the Market
Interview prep
- Zhou — Practical Guide to Quantitative Finance Interviews (Green Book)
- Crack — Heard on the Street
- Joshi — Quant Job Interview Questions
Competitions (useful for practice and signal)
- Jane Street Kaggle ($100K prize)
- WorldQuant BRAIN (100K+ users, pays for alpha signals — potentially interesting as a small bet of its own)
- Citadel Datathon (fast-track to employment)
- Jane Street monthly puzzles (above interview difficulty)
Author’s three hard-won lessons
- Estimation error is the real enemy. Full Kelly betting, unconstrained Markowitz, and ML models with too many features all fail for the same reason — overfitting noisy parameter estimates. The math works perfectly with true parameters; you never have true parameters.
- Tools have democratized, conviction hasn’t. QuantLib, Polygon, and PyTorch are now free or cheap. Technology is necessary but not sufficient. Edge lives in unique data, unique models, or unique execution.
- The math is the moat. AI can write code and suggest strategies, but fluency in why Itô’s lemma has its extra term, why discounted prices are martingales under the risk-neutral measure, when a convex relaxation is tight versus loose — that separates quants who build edge from quants who borrow it. Borrowed edge expires.
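The estimation-error lesson can be made concrete with a small Kelly calculation. For a bet paying `odds` to 1 with win probability p, the Kelly fraction is f* = p − (1 − p) / odds, and the expected log growth at stake f is p·ln(1 + f·odds) + (1 − p)·ln(1 − f). The probabilities below are hypothetical: a true edge of 52% that a noisy backtest has overestimated as 55%.

```python
import math


def kelly_fraction(p_win, odds):
    """Kelly fraction for a bet paying `odds` to 1: f* = p - (1 - p) / odds."""
    return p_win - (1.0 - p_win) / odds


def log_growth(fraction, p_win, odds):
    """Expected log growth per bet when staking `fraction` of the bankroll."""
    return p_win * math.log(1.0 + fraction * odds) + (1.0 - p_win) * math.log(1.0 - fraction)


true_p = 0.52        # assumed true win probability
estimated_p = 0.55   # hypothetical overestimate from a noisy backtest
odds = 1.0           # even-money bet

f_true = kelly_fraction(true_p, odds)       # what you would bet with perfect knowledge
f_est = kelly_fraction(estimated_p, odds)   # what the noisy estimate tells you to bet

# Growth is always evaluated under the TRUE probability
growth_true = log_growth(f_true, true_p, odds)
growth_full = log_growth(f_est, true_p, odds)
growth_half = log_growth(0.5 * f_est, true_p, odds)
print(f_true, f_est, growth_true, growth_full, growth_half)
```

With these numbers, full Kelly on the overestimated edge has negative expected log growth even though the true edge is positive, while half Kelly on the same estimate stays positive. That asymmetry is one common argument for fractional Kelly sizing when parameters are estimated rather than known.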
What this means for RDCO automated investing
This reframes the Automated Investing small bet from “let’s build an AI agent that trades” to “let’s build an AI agent that trades on top of a math foundation we actually understand.” Concretely:
- Don’t skip to strategy. The author’s warning about estimation error and noisy-BS applies double to us. Our first 10 backtested strategies will look good and be garbage. We need the statistics layer in place before we trust any backtest.
- The eigenvalue insight is load-bearing. Five eigenvectors explaining 70% of variance across 500 stocks is exactly the kind of compression a small-budget agent can exploit — PCA-based factor models scale to our constraints.
- Polymarket / LMSR is a lower-stakes sandbox. If we want to pressure-test probability and Bayesian updating without putting real capital on equities, prediction markets give us a cleaner feedback loop (binary outcomes, bounded losses, transparent math).
- The tool stack is mostly free. Python + yfinance + cvxpy + QuantLib + backtrader covers every Level 1–4 homework assignment. Polygon.io at $199/mo is the only mid-range spend I’d consider, and only after we have a working Level 2+ pipeline.
- Monitor fits here. Per 2026-04-10-claude-code-monitor-tool, the cleanest use case for Claude Code’s Monitor tool is streaming tick data from a subscriber CLI — not polling Notion. If/when we’re streaming live market data, Monitor is the right primitive.
Action items (triaged into the board elsewhere)
- Stub the five-level curriculum as a learning plan in 01-projects/automated-investing/curriculum.md, with estimated hours and weekly milestones
- Set up a local Python env with numpy/pandas/scipy/cvxpy/yfinance and run the Level 1 coin-flip + Bayesian updater drills as a vibe check
- Evaluate WorldQuant BRAIN as a standalone micro-bet (they pay for alpha signals — low-risk way to monetize early statistics work)
- Flag Polymarket/LMSR as a prediction-markets sandbox for probability practice
- Download the free PDFs (Blitzstein, Boyd) into 06-reference/textbooks/ so they’re on the Mac Mini
Related
- 01-projects/automated-investing/index — the active project this article feeds
- 2026-04-04-swing-trading-guide — Kevin Xu’s volume + catalysts approach, complementary style
- 2026-04-10-claude-code-monitor-tool — Monitor is the right primitive for streaming tick data
- 2026-04-08-four-levels-of-ai-use — this is a Level 3/4 project (things worth doing only because we have agentic automation)
Part 2 preview (not in this article, author teased)
Exotic derivatives (barriers, Asians, lookbacks), stochastic volatility (Heston calibration), jump-diffusion (Merton), martingale representation, Almgren-Chriss optimal execution, RL for market making, transformer architectures for financial time series, FPGA infra, WebSocket feeds, Frank-Wolfe with Gurobi for combinatorial arbitrage.