Hands-On Machine Learning — Aurélien Géron
Summary
Géron’s practical ML textbook, highlighted for foundational concepts rather than code. The highlights capture the conceptual scaffolding that makes ML intuitive. Core mental models:
- When ML Is the Right Tool. Three conditions: (a) problems where hand-tuned rules are too complex, (b) problems with no good traditional solution, (c) getting insights from large or complex datasets. If you can write explicit rules that work, you probably don’t need ML.
- Feature vs. Attribute. An attribute is a data type (“Mileage”); a feature is an attribute plus its value (“Mileage = 15,000”). This distinction matters when discussing data schemas versus model inputs.
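A minimal sketch of the distinction (the names below are hypothetical, not from the book): a schema lists attributes, a record binds each attribute to a value, and the resulting features are what the model actually consumes.

```python
# Attributes are schema-level names; features pair an attribute with a value.
schema = ["Mileage", "Year", "Brand"]                        # attributes only
record = {"Mileage": 15_000, "Year": 2019, "Brand": "Fiat"}  # features

# A model input is the feature values laid out in schema order.
feature_vector = [record[attr] for attr in schema]
```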
- The Bias–Variance Zoo. Overfitting: the model is too complex relative to the quantity and noisiness of the data; remedied by more data, regularization, or a simpler model. Sampling bias: nonrepresentative data from a flawed collection method (even large samples can have this); remedied by a better sampling procedure. Data snooping bias: using information you wouldn’t have had at prediction time; remedied by strict train/test separation. Each failure mode has a different remedy.
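Data snooping is easiest to see in preprocessing. A toy sketch (illustrative, not from the book): centering features with a mean computed over all the data leaks test-set information into the model, while fitting the statistic on the training split alone does not.

```python
import random

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(100)]
train, test = data[:80], data[80:]

# WRONG: the mean is computed over all data, so test-set information
# leaks into preprocessing -- information unavailable at prediction time.
snooped_mean = sum(data) / len(data)

# RIGHT: fit preprocessing statistics on the training split only,
# then apply them unchanged to the test split.
train_mean = sum(train) / len(train)
centered_test = [x - train_mean for x in test]
```

The same rule applies to any fitted transformer (scalers, encoders, imputers): fit on the training data, apply everywhere else.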
- Dimensionality Reduction as Preprocessing. Reducing the dimensionality of training data before feeding it to another algorithm often improves speed and storage, and sometimes even performance. The counterintuitive insight: throwing away some data can make your model better.
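A pure-Python sketch of why this works, using the closed form for the principal axis of a 2x2 covariance matrix (illustrative, not the book’s code): when two features are strongly correlated, projecting onto one axis discards a dimension while keeping nearly all the variance.

```python
import math
import random

random.seed(1)
pts = []
for _ in range(500):
    x = random.gauss(0.0, 1.0)
    pts.append((x, x + random.gauss(0.0, 0.1)))  # y is nearly x: a redundant dimension

n = len(pts)
mx = sum(p[0] for p in pts) / n
my = sum(p[1] for p in pts) / n
var_x = sum((p[0] - mx) ** 2 for p in pts) / n
var_y = sum((p[1] - my) ** 2 for p in pts) / n
cov_xy = sum((p[0] - mx) * (p[1] - my) for p in pts) / n

# The principal axis of a 2x2 covariance matrix has a closed form.
theta = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
ux, uy = math.cos(theta), math.sin(theta)

# Project the 2-D points onto the single principal axis (2-D -> 1-D).
proj = [(p[0] - mx) * ux + (p[1] - my) * uy for p in pts]
kept = (sum(z * z for z in proj) / n) / (var_x + var_y)  # fraction of variance retained
```

In practice a library routine (e.g. PCA with a chosen component count) replaces the closed form, but the trade is the same: fewer dimensions for a small, measurable loss of variance.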
- Pipeline Resilience and Silent Failures. In ML pipelines, a broken upstream component can go unnoticed because downstream components continue using its stale output. The system degrades silently: without proper monitoring, data gets stale and overall performance drops gradually rather than crashing obviously.
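One common guard is a freshness check on every read. A hypothetical sketch (class and method names are mine, not from the book): the downstream consumer refuses stale values instead of silently serving the last good one.

```python
import time

class FreshValue:
    """Holds an upstream component's latest output, rejecting stale reads."""

    def __init__(self, max_age_s):
        self.max_age_s = max_age_s
        self.value = None
        self.updated_at = None

    def update(self, value):
        self.value = value
        self.updated_at = time.monotonic()

    def read(self):
        # Without this check, a dead upstream goes unnoticed: the last
        # good value keeps flowing and quality degrades silently.
        if self.updated_at is None:
            raise RuntimeError("no upstream data received yet")
        age = time.monotonic() - self.updated_at
        if age > self.max_age_s:
            raise RuntimeError(f"upstream data is stale ({age:.1f}s old)")
        return self.value
```

While the upstream keeps calling `update()`, reads succeed; once it dies, reads fail loudly instead of returning stale output, turning a silent degradation into a visible error.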
- Cost Functions vs. Utility Functions. You can frame model evaluation as minimizing how bad the model is (a cost function) or maximizing how good it is (a utility function). The two are equivalent up to a sign flip, but the choice affects optimization conventions and interpretation.
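The two framings select the same model. A toy sketch (illustrative, not from the book) with a quadratic objective and a simple grid search standing in for any optimizer:

```python
def cost(theta):
    """How bad: squared error around the optimum at theta = 3."""
    return (theta - 3.0) ** 2

def utility(theta):
    """How good: the same objective with the sign flipped."""
    return -cost(theta)

candidates = [i / 10 for i in range(61)]        # grid over [0.0, 6.0]
best_by_cost = min(candidates, key=cost)        # minimize how bad
best_by_utility = max(candidates, key=utility)  # maximize how good
# Both searches land on the same parameter, theta = 3.0.
```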
Relevance
- 06-reference/2026-04-03-data-science-for-business — Géron’s “when to use ML” criteria complement Provost’s canonical task taxonomy. Together they answer: what task type, and whether ML is even the right approach.
- 06-reference/2026-04-03-feature-stores-hierarchy — The feature vs. attribute distinction is why feature stores exist — they manage the mapping from raw attributes to model-ready features.
- 06-reference/2026-04-03-ghosts-in-the-data — Silent pipeline failures are exactly the “ghosts” that haunt data systems. Monitoring for data freshness is as critical as monitoring for errors.
Open Questions
- In the LLM era, does dimensionality reduction still matter, or do transformer architectures handle high-dimensional input natively?
- How should silent failure monitoring adapt to AI agent systems where the “pipeline” is a conversation chain?