Chapter 7: When and Where --- Design Patterns — Corr & Stagnitto
Why this is in the vault
Time and location are the two dimensions that appear in virtually every fact table. Getting them wrong cascades into broken YTD comparisons, incorrect fiscal reporting, and ETL pipelines that silently produce wrong numbers. This chapter is a pattern catalog for every time/location modeling decision Ray Data Co will encounter in client engagements --- from basic calendar dimensions through multinational holiday handling and journey analysis. The patterns here complement Kimball’s Toolkit coverage but go deeper on operational concerns like fact-specific calendars and date version keys.
Key concepts
Time dimensions (the “why bother” case)
Every fact should join to a physical time dimension rather than relying on raw timestamps. The rationale is efficiency and consistency: descriptive time attributes (day of week, fiscal period, holiday flag) are derived once in the dimension rather than recalculated by every query. Many attributes --- fiscal periods, holidays, seasons --- cannot be derived from timestamps at all because they are organization- or geography-specific.
Splitting date from time of day
Time is best modeled as two separate physical dimensions: a calendar (date) dimension and a clock (time-of-day) dimension. This reflects how users actually query: most analysis groups by sets of days (months, quarters, fiscal periods), while a smaller set of queries groups by periods within a day (shifts, peak/off-peak). Separating them keeps each dimension small, makes fact-table time granularity explicit, and avoids the combinatorial explosion of merging them (365 x 1,440 = 525,600 rows per year at minute grain).
Calendar dimensions
A well-built calendar dimension pre-computes all the groupings stakeholders need: day, week, month, quarter, year, fiscal period, fiscal year, holiday flags, and epoch counters (“day overall,” “month overall”) that enable cross-year-boundary intervals like “last 60 days.” Calendar dimensions are role-playing --- a single physical table can serve as ORDER DATE, DELIVERY DATE, etc. through aliased views. They are small (20 years of history + 10 future = ~7,300 rows) and should be pre-populated for the foreseeable future. Fiscal calendar attributes typically come from Finance; holiday schedules from HR or national calendars; seasonal info from Sales/Marketing.
Date keys
Unlike other surrogate keys, date keys should follow a consistent calendar-order sequence, enabling partition range scans and BETWEEN joins. Two main formats:
- ISO date keys (YYYYMMDD) --- easy for ETL to generate from source dates, human-readable, good for partition setup, but tempt BI users to filter on the key directly instead of using dimension attributes.
- Epoch-based keys --- integer offset from an origin date (e.g., 1/1/1900). More compact, harder to misuse, but less readable.
A date-keyed version of the calendar (materialized view keyed on the actual date rather than the surrogate) is useful as an outrigger for joining date-typed attributes like FIRST_PURCHASE_DATE or HIRE_DATE to the full calendar, but should not replace the surrogate-keyed calendar for fact tables.
Period calendars and month dimensions
Periodic snapshots and aggregate fact tables at weekly, monthly, or quarterly grain should not reuse the daily calendar --- doing so misleads users into thinking they can filter by day-level attributes on monthly data. Instead, create a separate MONTH rollup dimension (derivable as a materialized view from CALENDAR) with its own MONTH KEY. This makes the grain explicit. MONTH KEY can be set to MAX(DATE_KEY) within the month so that BI tools needing a single calendar table can fall back to CALENDAR at query time.
Offset calendars
Fact-specific origin dates (e.g., policy creation date for insurance) need their own offset calendar so that “month 4” means four months since the policy started, not April. These can coexist with the standard calendar on the same fact table, doubling rows but enabling queries from either perspective.
Year-to-date comparisons
YTD analysis is deceptively hard. Three questions must be nailed down: (1) which “from date” --- calendar year start or fiscal year start? (2) what does “to date” mean when different fact tables have different load schedules? (3) should prior-year YTD match by same calendar date or same number of elapsed days?
The solution is a FACT STATE table that records the most recent load date and last complete load date for each fact table, updated by ETL. This metadata is then merged into a fact-specific calendar --- a view that cross-joins FACT STATE with CALENDAR so every calendar row carries the YTD status of its associated fact table. The fact-specific calendar turns SYSDATE-based YTD hacks into simple attribute filters.
Conformed date ranges
When comparing YTD across multiple fact tables (e.g., sales vs. commissions), use the earliest LAST COMPLETE DATE from FACT STATE across all involved tables to define a conformed date range. Otherwise, comparing tables with different load cadences produces misleading totals.
Clock dimensions
A clock dimension holds time-of-day descriptions at minute granularity (1,440 rows + an unknown/N/A row). Attributes include hour, minute, AM/PM, minute in day, work shift, day/night, and peak/off-peak. Unlike CALENDAR (which is fixed-value), CLOCK is often historically variable --- shift definitions and peak periods change over time, making it an HV (historically variable) dimension.
Day Clock pattern
When time-of-day attributes like peak/off-peak vary by day type (weekday vs. weekend vs. holiday), create a DAY CLOCK dimension with versions of each minute per day type. This avoids a full date-times cross-product; ~14 day-type variations yield ~20,000 rows vs. 525,600 for a naive merge. The TIME KEY in this pattern is a normal surrogate (not time-derived), allowing new versions to be added as shift definitions change.
International time
Global operations need both local and standard (e.g., UTC) time perspectives. The solution is dimensional overloading: add LOCAL DATE KEY and LOCAL TIME KEY to the fact table alongside the standard keys, letting CALENDAR and CLOCK play dual roles. ETL handles all timezone conversions centrally.
Multinational calendar pattern
Holidays and seasons are geopolitical --- they vary by both date and location. For a handful of countries, repeating attribute groups (ENGLISH HOLIDAY FLAG, WELSH HOLIDAY FLAG, etc.) suffices. Beyond that, a MULTINATIONAL CALENDAR stores geopolitical versions of each date, each with its own DATE KEY. A single date like March 17 may have three versions (US/UK, Ireland, South Africa) with different SEASON and HOLIDAY values. Queries that ignore geopolitical attributes naturally roll up correctly; only queries filtering on SEASON or HOLIDAY surface the versioning. Country-specific calendar views (pre-filtered to one country) are an anti-pattern for international fact tables because they silently prevent cross-border analysis.
Date version keys
When using multinational calendars, append a fixed-length version number to the date key (e.g., YYYYMMDDVV or epoch + VV) to preserve sort order and enable partition range scans. Start with a “00” standard version for every date and add versions only when geopolitical attributes actually differ --- most dates globally will need only one version.
Location dimensions and journey analysis
Location is the spatial counterpart to time. Movement events (“from” and “to” prepositions in stakeholder stories) produce pairs of when/where details with derived measures like distance, duration, and speed. A flight fact table, for example, uses AIRPORT as a role-playing dimension for departure and arrival, with CALENDAR and CLOCK each playing departure/arrival roles too. Individual movements may compose into multi-leg journeys, which are analytically valuable but hard to query from a flat event table --- stakeholders asking “where do employees travel to?” really want journey-level origin/destination, not individual flight legs.
Mapping against Ray Data Co
- Client warehouse builds. The calendar dimension pattern (with epoch counters, fiscal periods, and holiday flags) is table-stakes for any DW engagement. The fact-specific calendar / FACT STATE pattern for YTD is less commonly implemented but solves a real pain point --- clients frequently complain that YTD comparisons are wrong because different tables load on different schedules. This is a concrete value-add to propose during design workshops.
- Data quality framework. FACT STATE is essentially data freshness metadata. It aligns with the data quality monitoring work --- recency and completeness checks map directly to MOST RECENT LOAD DATE and LAST COMPLETE LOAD DATE. Consider surfacing FACT STATE attributes in any data quality dashboard we build.
- Multinational considerations. Most current client work is US-centric, but the multinational calendar pattern is worth filing for when a client goes global. The key takeaway: version the date key from day one (add “00” suffix) so the schema doesn’t need to be rebuilt when international expansion happens.
- Clock dimension for operational analytics. Clients doing operational or behavioral analytics (web traffic, IoT, call center) benefit from the DAY CLOCK pattern. The peak/off-peak and shift attributes are exactly what these use cases need, and separating them from the calendar avoids combinatorial blowup.
- Journey/movement modeling. The flight-fact star schema is a useful reference pattern for any client modeling logistics, supply chain, or user navigation flows. The “from/to” heuristic for detecting movement stories in stakeholder language is a practical facilitation tip for BEAM workshops.
Related
- 06-reference/2026-04-03-the-data-warehouse-toolkit --- Kimball’s calendar dimension coverage is foundational; Corr extends it with FACT STATE, fact-specific calendars, and multinational versioning
- 06-reference/2026-04-04-building-the-event-clock --- Kirk Marple’s event clock vs. state clock distinction; the FACT STATE table here is a primitive event clock --- it captures when state changed but not why
- 06-reference/2026-04-04-dedp-dwh-mdm-datalake-reverse-etl-cdp --- warehouse architecture patterns that these time/location dimensions plug into
- 01-projects/data-quality-framework/testing-matrix-template --- FACT STATE maps directly to freshness/completeness test dimensions