Chapter 4: Modeling Business Processes — Corr & Stagnitto
Why this is in the vault
This chapter is the operational core of BEAM (Business Event Analysis & Modeling). It bridges the gap between modeling a single event table (Ch 2-3) and planning an entire data warehouse. The event matrix it introduces is the single most practical planning artifact for dimensional warehouse design — it forces conformance decisions early, exposes process sequences, and produces a prioritized product backlog. For Ray Data Co, every data modeling engagement should start with something like this matrix rather than jumping straight to dbt models.
Key concepts
Conformed dimensions
A conformed dimension is a single physical dimension shared by multiple fact tables, or separate dimensions that contain identical attributes with identical business meanings. They are what makes cross-process analysis possible. Without them, you get the silo data mart anti-pattern: each department gets its own mart, each defines PRODUCT or CUSTOMER slightly differently, and cross-department reporting becomes impossible. Rebuilding siloed marts is extremely expensive because you are refactoring historical data, not just code.
Three types exist at the attribute conformance level:
- Swappable [SD] — subsets of a larger conformed dimension (e.g. WARRANTY CUSTOMER is a subset of CUSTOMER), enabling filtered comparisons.
- Rollup [RU] — aggregated versions sharing attributes with a base dimension (e.g. MONTH rolls up from CALENDAR), used for planning events and aggregate fact tables.
- Role-playing [RP] — a single dimension used in multiple logical roles (e.g. CALENDAR playing ORDER DATE and SHIP DATE). Identifying RP dimensions early avoids costly refactoring after deployment.
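The role-playing case is the easiest of the three to show in code. The sketch below uses an in-memory SQLite database and joins one physical CALENDAR table twice under two aliases. The table names, column names, and sample rows are illustrative assumptions, not taken from the chapter.

```python
import sqlite3

# Hypothetical schema: one physical CALENDAR dimension, one fact table
# with two date keys that both point at it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calendar (
    date_key   INTEGER PRIMARY KEY,
    full_date  TEXT,
    month_name TEXT
);
CREATE TABLE order_facts (
    order_id       TEXT,
    order_date_key INTEGER,
    ship_date_key  INTEGER,
    quantity       INTEGER
);
INSERT INTO calendar VALUES
    (20240130, '2024-01-30', 'January'),
    (20240202, '2024-02-02', 'February');
INSERT INTO order_facts VALUES ('ORD-1', 20240130, 20240202, 5);
""")

# CALENDAR plays two roles: ORDER DATE (alias od) and SHIP DATE (alias sd).
rows = conn.execute("""
SELECT f.order_id,
       od.month_name AS order_month,
       sd.month_name AS ship_month
FROM order_facts AS f
JOIN calendar AS od ON od.date_key = f.order_date_key
JOIN calendar AS sd ON sd.date_key = f.ship_date_key
""").fetchall()

print(rows)
```

Because both roles read from the same table, any attribute added to CALENDAR is available to ORDER DATE and SHIP DATE at once.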
Conformance is fundamentally a political problem, not a technical one. Homonyms (same term, different meaning across departments) and synonyms (different terms, same meaning) are the usual blockers. Modelstorming with examples from both sides is the most effective way to surface and resolve these conflicts.
The data warehouse bus
Conformed dimensions define a data warehouse bus — an integration standard that lets independently built data marts plug in and interoperate, analogous to USB. The bus requires more upfront work (modeling multiple processes, confronting political conformance challenges, building MDM-aware ETL) but the payback is dramatically less technical debt and the ability for teams to work in parallel once the bus is defined. Local (non-conformed) dimensions are always permitted in addition to bus dimensions, never as replacements.
The event matrix
The event matrix is a grid with business processes as rows and dimensions as columns, with a tick mark at each intersection where a process uses a dimension. It is the central planning artifact for the bus. Two ways to read it:
- Scan columns to find conformed dimensions (multiple ticks = cross-process reuse).
- Scan rows to estimate process complexity (more dimensions = more modeling effort).
The enhanced version (the BEAM event matrix) adds: event sequences with main clauses, dimensions ordered by W-type (who, what, when, where, why, how), stakeholder group columns for ownership, and importance/estimate rows for Scrum-style prioritization. Events that create new dimension values get a star rather than a tick, which signals a significant dependency.
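Both ways of reading the matrix, plus the star convention, can be sketched with a plain dictionary. The events, dimensions, and marks below are made-up examples, and the function names are hypothetical helpers rather than anything defined by BEAM.

```python
# Hypothetical event matrix: rows are events, columns are dimensions.
# "x" = the event uses the dimension (a tick).
# "*" = the event creates new values for the dimension (a star).
EVENT_MATRIX = {
    "CUSTOMER ORDERS": {
        "CUSTOMER": "*", "PRODUCT": "x", "CALENDAR": "x", "PROMOTION": "x",
    },
    "WAREHOUSE SHIPMENTS": {
        "CUSTOMER": "x", "PRODUCT": "x", "CALENDAR": "x",
        "WAREHOUSE": "x", "CARRIER": "x",
    },
    "CUSTOMER PAYMENTS": {"CUSTOMER": "x", "CALENDAR": "x"},
}


def conformed_dimensions(matrix):
    """Scan columns: dimensions marked on more than one event."""
    counts = {}
    for dims in matrix.values():
        for dim in dims:
            counts[dim] = counts.get(dim, 0) + 1
    return sorted(dim for dim, n in counts.items() if n > 1)


def complexity(matrix):
    """Scan rows: dimension count per event, most complex first."""
    pairs = [(event, len(dims)) for event, dims in matrix.items()]
    return sorted(pairs, key=lambda pair: -pair[1])


def creators(matrix):
    """Map each starred dimension to the event that creates its values."""
    return {
        dim: event
        for event, dims in matrix.items()
        for dim, mark in dims.items()
        if mark == "*"
    }


print(conformed_dimensions(EVENT_MATRIX))
print(complexity(EVENT_MATRIX))
print(creators(EVENT_MATRIX))
```

In this sample, CUSTOMER is starred on CUSTOMER ORDERS, so any event that reuses CUSTOMER depends on the orders event being loaded first.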
Event sequences
Events on the matrix are ordered by value chain (increasing business value output), not alphabetically. This roughly maps to time sequence at a macro level. Within a value chain, stricter process sequences exist — chains of milestone events that must occur in order (e.g. purchase order, then delivery, then payment). These are denoted by indentation on the matrix.
Key discovery technique: ask “why does X happen?” — if the answer points to a prior event, you have found a process sequence linked by a conformed degenerate dimension (like ORDER ID appearing on both orders and shipments). Process sequences can be modeled as a single evolving event for process performance measurement (late deliveries, average fulfillment time).
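A minimal sketch of the evolving-event idea: two milestone events are linked on their shared ORDER ID and collapsed into one row per order with a duration measure. The sample records and field names are assumptions, and the sketch assumes at most one shipment per order, which real data may not honor.

```python
from datetime import date

# Hypothetical milestone events sharing the degenerate dimension ORDER ID.
orders = [
    {"order_id": "ORD-1", "order_date": date(2024, 1, 30)},
    {"order_id": "ORD-2", "order_date": date(2024, 2, 1)},
]
shipments = [
    {"order_id": "ORD-1", "ship_date": date(2024, 2, 2)},
]


def evolving_events(orders, shipments):
    """Build one row per order, filled in as later milestones arrive.

    Assumes at most one shipment per order (an illustrative simplification).
    """
    shipped = {s["order_id"]: s["ship_date"] for s in shipments}
    rows = []
    for order in orders:
        ship_date = shipped.get(order["order_id"])
        days = None
        if ship_date is not None:
            days = (ship_date - order["order_date"]).days
        rows.append({
            "order_id": order["order_id"],
            "order_date": order["order_date"],
            "ship_date": ship_date,      # None until the milestone happens
            "days_to_ship": days,
        })
    return rows


rows = evolving_events(orders, shipments)
completed = [r["days_to_ship"] for r in rows if r["days_to_ship"] is not None]
average_days = sum(completed) / len(completed)
print(rows)
print(average_days)
```

Unshipped orders stay in the result with empty milestone fields, which is what makes measures such as open-order counts and late deliveries possible.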
Gaps in the matrix between events often signal missing business processes. Asking “does anything costly happen between X and Y?” reliably surfaces them.
Role-playing dimensions
When a new event introduces what looks like a new dimension (e.g. WAREHOUSE WORKER), check whether it is actually a role of an existing conformed dimension (EMPLOYEE). The W-type helps: if there are already two “who” dimensions, ask stakeholders whether the new one shares attributes. If yes, rename to the generalized form (EMPLOYEE [RP]) and document the specific role in the event detail using bracket notation ([employee]).
Generalization is powerful but carries BI usability risks. If stakeholders see no analytical benefit to combining entities, postpone generalization until star schema design — the modeling stage should produce dimensions stakeholders recognize and want to query.
The event rating game
A structured prioritization exercise for deciding what to build first:
- Every event gets an importance rating in 100-point increments (100, 200, 300…). Ratings are unique, apart from the two exceptions in the next bullet.
- Already-completed events score 0; truly unimportant ones can share a rating of 100.
- Dimensions are then rated higher than their highest-importance event (in 5-10 point increments), with conformed dimensions rated above non-conformed ones.
- Events, dimensions, and BI reports all merge onto a single product backlog, sorted by importance.
This produces a Scrum-ready backlog directly from the modelstorming session. The product owner adjusts ratings later using source data profiling and ETL estimates.
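The scoring rules can be sketched as a small backlog builder. The event ratings and dimension lists are invented, and one reading of the "5-10 point increments" rule is assumed here: a dimension used by more than one event gets +10 over its highest-rated event, and a dimension used by only one event gets +5. Ties between dimensions are left for the product owner to break.

```python
# Hypothetical ratings from a modelstorming session.
events = {
    "CUSTOMER ORDERS": 300,
    "WAREHOUSE SHIPMENTS": 200,
    "CUSTOMER PAYMENTS": 100,
}
event_dimensions = {
    "CUSTOMER ORDERS": ["CUSTOMER", "PRODUCT", "PROMOTION"],
    "WAREHOUSE SHIPMENTS": ["CUSTOMER", "PRODUCT", "WAREHOUSE"],
    "CUSTOMER PAYMENTS": ["CUSTOMER"],
}


def build_backlog(events, event_dimensions, step=5):
    """Merge events and dimensions into one list sorted by importance."""
    top = {}   # highest event rating per dimension
    uses = {}  # number of events using each dimension
    for event, dims in event_dimensions.items():
        for dim in dims:
            top[dim] = max(top.get(dim, 0), events[event])
            uses[dim] = uses.get(dim, 0) + 1

    backlog = [(name, "event", score) for name, score in events.items()]
    for dim, score in top.items():
        # Assumption: conformed (multi-event) dimensions get the larger bump.
        bump = 2 * step if uses[dim] > 1 else step
        backlog.append((dim, "dimension", score + bump))
    return sorted(backlog, key=lambda item: -item[2])


backlog = build_backlog(events, event_dimensions)
for name, kind, score in backlog:
    print(f"{score:>4}  {kind:<9}  {name}")
```

Sorting by score puts each dimension ahead of the events that need it, so the backlog order doubles as a build order.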
Mapping against Ray Data Co
- Engagement kickoff tool. The event matrix should be the first deliverable in any data modeling engagement, before writing dbt models and before selecting tools. It forces the right conversations about scope, conformance, and priority. A two-hour modelstorming session with stakeholders can produce a matrix that guides months of sprint work.
- Silo data mart warning. The chapter’s anti-pattern — department-scoped agile sprints producing incompatible marts — is exactly what happens when dbt projects start without cross-functional alignment. The bus matrix is the prevention.
- Process sequences for pipeline design. Identifying which events are milestones of a larger process directly informs how to structure incremental/evolving fact tables, which is a common dbt modeling decision (one wide fact vs. separate milestone facts).
- Rating game for client prioritization. The 100-point-increment scoring system is a clean facilitation technique for clients who struggle to prioritize data work. Worth incorporating into discovery workshops.
- Conformed degenerate dimensions. The insight that shared transaction IDs (ORDER ID appearing on orders, shipments, and payments) signal process sequences is immediately useful for lineage analysis and data quality checks.
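As one possible data quality check built on the last point, the sketch below flags downstream events whose ORDER ID never appeared upstream. The ID lists and the helper name `orphaned_ids` are hypothetical; in a dbt project the same idea would more likely be expressed as a relationship test.

```python
def orphaned_ids(upstream_ids, downstream_ids):
    """Return downstream IDs that have no matching upstream event."""
    return sorted(set(downstream_ids) - set(upstream_ids))


# Hypothetical ORDER ID values pulled from three milestone event tables.
order_ids = ["ORD-1", "ORD-2", "ORD-3"]
shipment_order_ids = ["ORD-1", "ORD-3", "ORD-9"]
payment_order_ids = ["ORD-1"]

orphan_shipments = orphaned_ids(order_ids, shipment_order_ids)
orphan_payments = orphaned_ids(order_ids, payment_order_ids)
print(orphan_shipments)
print(orphan_payments)
```

A non-empty result means a milestone arrived without its predecessor, which points to either a missing source record or a break in the process sequence.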
Related
- 06-reference/2026-04-03-the-data-warehouse-toolkit — Kimball’s original bus matrix concept; Corr’s event matrix extends it with BEAM story-telling, 7W ordering, and Scrum integration
- 06-reference/2025-12-31-practical-data-modeling-kimball-vs-inmon — The Kimball vs. Inmon synthesis; Corr is firmly in the Kimball camp but the bus matrix is compatible with hybrid approaches
- 06-reference/2026-04-04-building-the-event-clock — Marple’s event clock vs. state clock distinction; BEAM’s event-first modeling is a systematic way to capture the event clock that warehouses typically miss
- 06-reference/2026-02-18-practical-data-modeling-mma-ch1-full — Reis’s Mixed Model Arts; the event matrix is one concrete technique for deciding which modeling approach fits which business process
- 06-reference/2026-04-04-dedp-dwh-mdm-datalake-reverse-etl-cdp — MDM and conformed dimensions are closely related; the data warehouse bus requires MDM-like governance for dimension conformance
- 01-projects/data-quality-framework/testing-matrix-template — The event matrix’s dimension-tick pattern could inform data quality test coverage planning