Feature Stores: A Hierarchy of Needs — Eugene Yan
Summary
Yan applies Maslow’s hierarchy of needs to feature stores, arguing that teams should satisfy foundational needs before reaching for advanced capabilities:
- Access (base) — documented features with transparency into logic/code and lineage to upstream sources. Enables discovery, sharing, and reuse.
- Serving — features available in production at high throughput and low latency, with offline-to-online sync.
- Integrity — point-in-time correctness (time travel) to prevent data leakage in training sets. Monitoring of model performance.
- Convenience — interactive exploration and experimentation with features.
- Autopilot (top) — automated backfilling, alerting, and self-healing.
The mental model: don’t build serving infrastructure before you’ve solved feature discovery. Don’t automate what you haven’t validated.
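A minimal sketch of what point-in-time correctness means at the integrity layer, using only the Python standard library. The feature history, the `feature_as_of` helper, and the dates are hypothetical illustrations, not taken from Yan's article or from any feature store's API. The idea is that a training row for a label observed at time `t` may only use the feature value that was already known at `t`.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical feature history for one entity (e.g. one user's 30-day purchase count),
# sorted by the date each value became known.
feature_history = [
    (date(2024, 1, 1), 3),
    (date(2024, 1, 15), 7),
    (date(2024, 2, 1), 12),
]


def feature_as_of(history, as_of):
    """Return the latest value known at or before `as_of`, or None if none existed yet."""
    timestamps = [ts for ts, _ in history]
    # bisect_right counts the entries with timestamp <= as_of.
    idx = bisect_right(timestamps, as_of)
    return history[idx - 1][1] if idx > 0 else None


# Hypothetical label timestamps: each training row must only see the past.
label_events = [date(2024, 1, 5), date(2024, 1, 20), date(2023, 12, 25)]
training_rows = [(t, feature_as_of(feature_history, t)) for t in label_events]
```

Joining each label to the latest feature value overall (12 here) would leak future information into the first two rows. The as-of lookup avoids that, and it returns `None` for the label that predates any recorded feature value.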
Relevance
Maps directly to 06-reference/2026-03-31-block-hierarchy-to-intelligence — both articles argue for layered capability building where each layer depends on the one below. The feature store hierarchy is a domain-specific instance of the general pattern.
Connects to 06-reference/2026-04-03-data-products-taxonomy — feature stores are Weber’s “ML platform” data product, and this hierarchy shows that even within one data product category, there’s a maturity sequence.
For 01-projects/phdata/index, this is a useful framework for scoping ML infrastructure engagements: assess where the client sits on the hierarchy and build the next layer, not the top layer. Also relevant to 01-projects/phdata/career-transition — demonstrates systems thinking about ML infrastructure.
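The scoping rule above can be sketched as a small function. This is an illustrative assumption about how such an assessment might be encoded, not something from the article: the `LAYERS` list mirrors the five levels in the summary, and `next_layer_to_build` is a hypothetical helper name.

```python
from typing import Iterable, Optional

# The five levels from the summary, ordered from base to top.
LAYERS = ["access", "serving", "integrity", "convenience", "autopilot"]


def next_layer_to_build(satisfied: Iterable[str]) -> Optional[str]:
    """Return the lowest layer that is not yet satisfied, or None if all are."""
    done = set(satisfied)
    for layer in LAYERS:
        if layer not in done:
            return layer
    return None


# A client with online serving and notebooks but no feature documentation
# should still start at the base of the hierarchy.
recommendation = next_layer_to_build({"serving", "convenience"})
```

Walking the list from the base means a gap at a lower layer always takes priority, even when higher layers are already in place.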
The access layer (documentation, lineage, reuse) connects to 06-reference/concepts/compounding-knowledge — feature reuse is knowledge compounding in infrastructure form. Also resonates with 06-reference/2026-04-03-combining-rule-engines-ml — the integrity layer (preventing data leakage) matters most when combining rule-based and ML approaches.
Open Questions
- Where does a Snowflake-native feature store (Snowpark + dynamic tables) sit vs. purpose-built tools like Feast or Tecton?
- Ray’s note about batch inference — does a batch-only workflow let you skip the “serving” layer and jump to integrity?
- How does the feature store hierarchy interact with the headless BI metrics layer (06-reference/2026-04-03-headless-bi)? Are features and metrics converging?