06-reference

book adwd ch7 when where design patterns 2026 04 13

Sun Apr 12 2026 20:00:00 GMT-0400 (Eastern Daylight Time) ·reference ·source: Agile Data Warehouse Design (book) ·by Lawrence Corr, Jim Stagnitto

Chapter 7: When and Where --- Design Patterns — Corr & Stagnitto

Why this is in the vault

Time and location are the two dimensions that appear in virtually every fact table. Getting them wrong cascades into broken YTD comparisons, incorrect fiscal reporting, and ETL pipelines that silently produce wrong numbers. This chapter is a pattern catalog for every time/location modeling decision Ray Data Co will encounter in client engagements --- from basic calendar dimensions through multinational holiday handling and journey analysis. The patterns here complement Kimball’s Toolkit coverage but go deeper on operational concerns like fact-specific calendars and date version keys.

Key concepts

Time dimensions (the “why bother” case)

Every fact should join to a physical time dimension rather than relying on raw timestamps. The rationale is efficiency and consistency: descriptive time attributes (day of week, fiscal period, holiday flag) are derived once in the dimension rather than recalculated by every query. Many attributes --- fiscal periods, holidays, seasons --- cannot be derived from timestamps at all because they are organization- or geography-specific.

Splitting date from time of day

Time is best modeled as two separate physical dimensions: a calendar (date) dimension and a clock (time-of-day) dimension. This reflects how users actually query: most analysis groups by sets of days (months, quarters, fiscal periods), while a smaller set of queries groups by periods within a day (shifts, peak/off-peak). Separating them keeps each dimension small, makes fact-table time granularity explicit, and avoids the combinatorial explosion of merging them (365 x 1,440 = 525,600 rows per year at minute grain).

Calendar dimensions

A well-built calendar dimension pre-computes all the groupings stakeholders need: day, week, month, quarter, year, fiscal period, fiscal year, holiday flags, and epoch counters (“day overall,” “month overall”) that enable cross-year-boundary intervals like “last 60 days.” Calendar dimensions are role-playing --- a single physical table can serve as ORDER DATE, DELIVERY DATE, etc. through aliased views. They are small (20 years of history + 10 future = ~7,300 rows) and should be pre-populated for the foreseeable future. Fiscal calendar attributes typically come from Finance; holiday schedules from HR or national calendars; seasonal info from Sales/Marketing.

Date keys

Unlike other surrogate keys, date keys should follow a consistent calendar-order sequence, enabling partition range scans and BETWEEN joins. Two main formats:

A date-keyed version of the calendar (materialized view keyed on the actual date rather than the surrogate) is useful as an outrigger for joining date-typed attributes like FIRST_PURCHASE_DATE or HIRE_DATE to the full calendar, but should not replace the surrogate-keyed calendar for fact tables.

Period calendars and month dimensions

Periodic snapshots and aggregate fact tables at weekly, monthly, or quarterly grain should not reuse the daily calendar --- doing so misleads users into thinking they can filter by day-level attributes on monthly data. Instead, create a separate MONTH rollup dimension (derivable as a materialized view from CALENDAR) with its own MONTH KEY. This makes the grain explicit. MONTH KEY can be set to MAX(DATE_KEY) within the month so that BI tools needing a single calendar table can fall back to CALENDAR at query time.

Offset calendars

Fact-specific origin dates (e.g., policy creation date for insurance) need their own offset calendar so that “month 4” means four months since the policy started, not April. These can coexist with the standard calendar on the same fact table, doubling rows but enabling queries from either perspective.

Year-to-date comparisons

YTD analysis is deceptively hard. Three questions must be nailed down: (1) which “from date” --- calendar year start or fiscal year start? (2) what does “to date” mean when different fact tables have different load schedules? (3) should prior-year YTD match by same calendar date or same number of elapsed days?

The solution is a FACT STATE table that records the most recent load date and last complete load date for each fact table, updated by ETL. This metadata is then merged into a fact-specific calendar --- a view that cross-joins FACT STATE with CALENDAR so every calendar row carries the YTD status of its associated fact table. The fact-specific calendar turns SYSDATE-based YTD hacks into simple attribute filters.

Conformed date ranges

When comparing YTD across multiple fact tables (e.g., sales vs. commissions), use the earliest LAST COMPLETE DATE from FACT STATE across all involved tables to define a conformed date range. Otherwise, comparing tables with different load cadences produces misleading totals.

Clock dimensions

A clock dimension holds time-of-day descriptions at minute granularity (1,440 rows + an unknown/N/A row). Attributes include hour, minute, AM/PM, minute in day, work shift, day/night, and peak/off-peak. Unlike CALENDAR (which is fixed-value), CLOCK is often historically variable --- shift definitions and peak periods change over time, making it an HV (historically variable) dimension.

Day Clock pattern

When time-of-day attributes like peak/off-peak vary by day type (weekday vs. weekend vs. holiday), create a DAY CLOCK dimension with versions of each minute per day type. This avoids a full date-times cross-product; ~14 day-type variations yield ~20,000 rows vs. 525,600 for a naive merge. The TIME KEY in this pattern is a normal surrogate (not time-derived), allowing new versions to be added as shift definitions change.

International time

Global operations need both local and standard (e.g., UTC) time perspectives. The solution is dimensional overloading: add LOCAL DATE KEY and LOCAL TIME KEY to the fact table alongside the standard keys, letting CALENDAR and CLOCK play dual roles. ETL handles all timezone conversions centrally.

Multinational calendar pattern

Holidays and seasons are geopolitical --- they vary by both date and location. For a handful of countries, repeating attribute groups (ENGLISH HOLIDAY FLAG, WELSH HOLIDAY FLAG, etc.) suffices. Beyond that, a MULTINATIONAL CALENDAR stores geopolitical versions of each date, each with its own DATE KEY. A single date like March 17 may have three versions (US/UK, Ireland, South Africa) with different SEASON and HOLIDAY values. Queries that ignore geopolitical attributes naturally roll up correctly; only queries filtering on SEASON or HOLIDAY surface the versioning. Country-specific calendar views (pre-filtered to one country) are an anti-pattern for international fact tables because they silently prevent cross-border analysis.

Date version keys

When using multinational calendars, append a fixed-length version number to the date key (e.g., YYYYMMDDVV or epoch + VV) to preserve sort order and enable partition range scans. Start with a “00” standard version for every date and add versions only when geopolitical attributes actually differ --- most dates globally will need only one version.

Location dimensions and journey analysis

Location is the spatial counterpart to time. Movement events (“from” and “to” prepositions in stakeholder stories) produce pairs of when/where details with derived measures like distance, duration, and speed. A flight fact table, for example, uses AIRPORT as a role-playing dimension for departure and arrival, with CALENDAR and CLOCK each playing departure/arrival roles too. Individual movements may compose into multi-leg journeys, which are analytically valuable but hard to query from a flat event table --- stakeholders asking “where do employees travel to?” really want journey-level origin/destination, not individual flight legs.

Mapping against Ray Data Co