06-reference

feature stores hierarchy

Thu Apr 02 2026 20:00:00 GMT-0400 (Eastern Daylight Time) · article · source: https://eugeneyan.com/writing/feature-stores/ · by Eugene Yan

Feature Stores: A Hierarchy of Needs — Eugene Yan

Summary

Yan applies Maslow’s hierarchy of needs to feature stores, arguing that teams should satisfy foundational needs before reaching for advanced capabilities:

  1. Access (base) — documented features with transparency into logic/code and lineage to upstream sources. Enables discovery, sharing, and reuse.
  2. Serving — features available in production at high throughput and low latency, with offline-to-online sync.
  3. Integrity — point-in-time correctness (time travel) to prevent data leakage in training sets. Monitoring of model performance.
  4. Convenience — interactive exploration and experimentation with features.
  5. Autopilot (top) — automated backfilling, alerting, and self-healing.
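
To make the integrity layer concrete, here is a minimal sketch of a point-in-time lookup. It is not from Yan's article: `feature_log`, `value_as_of`, and the sample dates are hypothetical, and a real feature store would run this as a join over many entities. The idea is that each training row only sees the feature value that was already known at the row's event time, so later values cannot leak in.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature history for one entity, sorted by time:
# (timestamp at which the value became known, feature value).
feature_log = [
    (datetime(2026, 1, 1), 3),
    (datetime(2026, 1, 10), 5),
    (datetime(2026, 1, 20), 9),
]


def value_as_of(log, event_time):
    """Return the latest value known at or before event_time, or None."""
    timestamps = [ts for ts, _ in log]
    # Index of the first entry strictly after event_time.
    idx = bisect_right(timestamps, event_time)
    return log[idx - 1][1] if idx > 0 else None


# Hypothetical label events; each is paired only with past feature values.
label_events = [
    datetime(2025, 12, 31),  # before any value exists
    datetime(2026, 1, 10),   # exactly at an update
    datetime(2026, 1, 15),   # between updates
]
training_rows = [(t, value_as_of(feature_log, t)) for t in label_events]
```

Treating a value stamped exactly at the event time as visible is an assumption of this sketch; a stricter setup would use `bisect_left` to exclude it.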

The mental model: don’t build serving infrastructure before you’ve solved feature discovery. Don’t automate what you haven’t validated.

Relevance

Maps directly to 06-reference/2026-03-31-block-hierarchy-to-intelligence — both articles argue for layered capability building where each layer depends on the one below. The feature store hierarchy is a domain-specific instance of the general pattern.

Connects to 06-reference/2026-04-03-data-products-taxonomy — feature stores are Weber’s “ML platform” data product, and this hierarchy shows that even within one data product category, there’s a maturity sequence.

For 01-projects/phdata/index, this is a useful framework for scoping ML infrastructure engagements: assess where the client sits on the hierarchy and build the next layer, not the top layer. Also relevant to 01-projects/phdata/career-transition — demonstrates systems thinking about ML infrastructure.
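
The scoping rule above ("build the next layer, not the top layer") can be sketched as a small helper. The layer names come from the summary; the function `next_layer_to_build` and the example client are hypothetical, and in practice "satisfied" would be a judgment call rather than a boolean.

```python
# Layers from the summary, ordered from base to top.
LAYERS = ["access", "serving", "integrity", "convenience", "autopilot"]


def next_layer_to_build(satisfied):
    """Return the lowest layer not in `satisfied`, or None if all are met."""
    for layer in LAYERS:
        if layer not in satisfied:
            return layer
    return None


# Hypothetical client: has a feature catalog and some automation,
# but no production serving yet.
client = {"access", "autopilot"}
recommendation = next_layer_to_build(client)
```

Here `recommendation` is `"serving"`: the automation at the top does not count for much while a lower layer is still missing.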

The access layer (documentation, lineage, reuse) connects to 06-reference/concepts/compounding-knowledge — feature reuse is knowledge compounding in infrastructure form. Also resonates with 06-reference/2026-04-03-combining-rule-engines-ml — the integrity layer (preventing data leakage) matters most when combining rule-based and ML approaches.

Open Questions