“Chapter 3: Modeling Business Dimensions” — Corr & Stagnitto

Why this is in the vault

This chapter is the practitioner’s playbook for turning event story details (from BEAM* modelstorming) into fully defined dimensions. It fills the gap between Kimball’s conceptual guidance on star schemas and the hands-on facilitation work we do with clients: how to sit with stakeholders, draw hierarchy charts on a whiteboard, and systematically discover attributes, hierarchies, and SCD policies using concrete data stories instead of abstract ER diagrams. For any Ray Data Co engagement that touches dimensional modeling — dbt semantic layers, warehouse redesigns, analytics engineering — this is the reference for how to run the modeling session, not just what the output should look like.

Key concepts

Dimension stories

Dimensions lack the narrative punch of event stories (events have active verbs; dimensions have static “has” and “is” relationships). To compensate, BEAM* modelers inject drama through two plot devices:

Hierarchy charts — ask stakeholders how they organize dimension members into groups, which surfaces low-cardinality grouping attributes and exposes informal data sources (spreadsheets, personal databases) that OLTP systems never capture.
Change stories — ask stakeholders how dimension members change over time, which reveals SCD policies and can uncover auxiliary business processes worth modeling as separate events.

Discovering and documenting dimensions

Each event detail with additional descriptive attributes becomes a candidate dimension. The chapter walks through a repeatable sequence:

Subject: the event detail becomes the first mandatory attribute (code MD).
Business key (BK): ask stakeholders for a unique, stable, mandatory identifier. Watch for “smart keys” with embedded meaning that will rot over time.
Attributes: use the 7Ws as a discovery checklist when stakeholder suggestions dry up. For each candidate, apply the moment-in-time test: “Can a [dimension] have more than one [attribute] at any moment?” A NO means it belongs; a YES means it may be a separate dimension or require granularity adjustment.
Exclusive attributes (Xn): flag mutually incompatible attribute groups (e.g., consumer vs. business attributes on a CUSTOMER dimension). Track their defining characteristic (DC) — the low-cardinality mandatory attribute that controls which group is valid.
Missing values: every dimension needs a missing-member row. Different flavors of missing (“Not Applicable,” “To be assigned”) may require separate missing stories.
Descriptive attributes: decode all smart keys into BI-friendly labels. The guiding principle is to push decode logic into ETL so reports never need SQL-level decoding.

Hierarchies

Three structural types, each available in single-parent or multi-parent form:

Type	Levels	Depth	Example
Balanced	Fixed count, each with a unique name	Uniform across all paths	Calendar: Day > Month > Quarter > Year
Ragged	Known max count with named levels	Some paths skip levels (nulls)	Product hierarchy where some items have no subcategory
Variable depth	Unknown count, unnamed levels	Arbitrary nesting	Org chart (recursive manager-employee relationship)

Multi-parent hierarchies arise when a member rolls up to more than one parent (e.g., a product belonging to multiple product types). These require careful fact allocation to avoid double-counting at higher roll-up levels. Variable-depth and multi-parent hierarchies cannot be modeled in plain dimension tables — they need hierarchy maps (covered in Chapter 6).

Hierarchy charts are the whiteboard tool: a vertical bar per dimension, tick marks for levels, annotations for cardinality, optional levels (brackets), multi-parent relationships (double bars), and recursive loops. They double as query-definition diagrams when annotated with X (WHERE clause) and O (GROUP BY) markers across an event’s dimensions.

Key modeling heuristics:

Two “M”s between levels = M:M = levels belong in separate hierarchies.
Two “1”s = 1:1 = same level; keep the more descriptive label.
Always check for “hot” planning levels (marked with *) that will drive aggregates and rollup dimensions.
Rearrange dimension columns in hierarchical order (low to high, left to right) after completing charts.

SCD history — CV, FV, HV (and hybrids)

The chapter reframes Kimball’s SCD types in stakeholder-friendly language:

FV (Fixed Value / Type 0): attribute does not change; updates are corrections only (e.g., Date of Birth). Business keys should default to FV.
CV (Current Value / Type 1): overwrites with the latest value. Gives “as is” reporting that matches operational systems but destroys history and makes reports non-reproducible over time.
HV (Historic Value / Type 2): preserves each version. Supports “as was” reporting and can also recast to current values for “as is” views. More ETL work, but the most flexible option. The authors recommend storing HV even when stakeholders initially request CV, since switching later is expensive.
HV/CV or CV/HV (hybrid): both historical and current columns stored physically. Default reporting behavior is the first code listed. Implemented via the hybrid SCD pattern (separate columns or hot-swappable dimensions).
CV/PV (Type 3): stores current value plus one previous value. Useful for infrequently changing attributes where stakeholders only care about the last transition.

Change stories

Change stories are documented by adding a row to the BEAM* dimension table and filling it with stakeholders. The process per attribute:

Ask: “Can the [attribute] of a [dimension] change?”
If NO — label FV, copy the typical example value into the change row.
If YES — ask: “Will you need its historic values for grouping/filtering?” If YES — label HV, ask for a new changed-value example, leave the typical example untouched. If NO — label CV, update both the typical and change rows to the new value so the overwrite is visually obvious.

Group change rules help ETL distinguish corrections from genuine changes. The example given: if a customer’s street address changes but ZIP stays the same, treat it as a correction; if both change together, treat it as a real move and preserve history.

Mapping against Ray Data Co

dbt semantic layer work: the attribute-discovery checklist (7Ws, moment-in-time test, exclusive attributes) translates directly into how we audit client dimension tables during dbt model reviews. Many client dimensions fail the “no SQL decodes at query time” test — we can use that as a concrete quality criterion.
Client facilitation: the hierarchy chart technique is a lightweight, whiteboard-friendly alternative to full ER diagramming when running discovery workshops. It is more accessible to non-technical stakeholders than jumping straight to dbt YAML or star-schema DDL.
SCD decisions: the FV/CV/HV framing is cleaner for stakeholder conversations than Kimball’s numeric types (0/1/2/3). Positioning HV as the safe default with CV as a reporting overlay (not a storage decision) avoids the common trap where clients pick Type 1 early and regret it later.
Newsletter angle: the concept of “smart keys” decaying over time as business processes evolve is a strong tangible example for a Sanity Check issue on data quality debt.

06-reference/2026-04-03-the-data-warehouse-toolkit — Kimball’s original dimensional modeling reference; this chapter operationalizes Kimball’s SCD types through stakeholder storytelling
06-reference/2025-12-31-practical-data-modeling-kimball-vs-inmon — Kimball vs. Inmon debate context; BEAM* sits firmly in the Kimball camp but adds agile facilitation techniques
06-reference/2026-02-18-practical-data-modeling-mma-ch1-full — Mixed Model Artist framing; BEAM* is primarily “Analytics Camp” tooling
06-reference/2026-04-04-dedp-dwh-mdm-datalake-reverse-etl-cdp — MDM discussion is directly relevant to the business-key discovery problem (multiple source systems, no single customer ID)
06-reference/2026-04-04-dedp-mv-obt-dbt-olap-dwa — OLAP cube dimensionality discussion connects to the hierarchy chart query-modeling technique
06-reference/2026-04-04-sql-patterns-xheblati — SQL patterns for implementing the SCD designs this chapter specifies at the logical level