Relationships: Tying It Together (Ch 7)
Chapter 7 covers relationships, the connections between entities. It uses a trail-running shoe purchase to show the same relationship expressed five different ways, one per camp:
- Relational: Foreign keys, constraints, referential integrity. Failure mode: expensive joins at scale
- Analytical: Facts and dimensions, historical context. Failure mode: treating analytical models as transactional source of truth
- Application: Embedded documents, streaming windows. Failure mode: unbounded embedding causing document bloat
- ML/AI: Vector distance, cosine similarity, probabilistic relationships. Failure mode: treating probabilities as facts
- Knowledge: Semantic triples (subject-predicate-object), ontology-based inference. Failure mode: expensive ontology maintenance that drifts from reality
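As a rough sketch of the five views listed above, the same purchase can be written down once per camp. All IDs, field names, and vectors here are made up for illustration and are not taken from the chapter; the cosine similarity function is the standard formula, written out by hand to avoid assuming any library.

```python
import math

# Hypothetical identifiers, for illustration only.
customer_id, product_id = 42, 7

# Relational: the order row points at customer and product by foreign key.
order_row = {"order_id": 1001, "customer_id": customer_id, "product_id": product_id}

# Analytical: a fact row holding dimension keys plus a measure.
fact_sale = {
    "date_key": 20240115,
    "customer_key": customer_id,
    "product_key": product_id,
    "amount": 129.0,
}

# Application: the purchase is embedded inside the customer document.
customer_doc = {
    "_id": customer_id,
    "orders": [{"order_id": 1001, "product_id": product_id}],
}

# Knowledge: a subject-predicate-object triple.
triple = ("customer:42", "purchased", "product:7")


# ML/AI: the relationship is a similarity score between vectors, not a fact.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


customer_vec = [1.0, 0.0, 1.0]  # made-up embedding
product_vec = [1.0, 1.0, 0.0]   # made-up embedding
score = cosine_similarity(customer_vec, product_vec)  # a score, not a yes/no
```

The first four forms assert that the purchase happened; the last only says how close two vectors are, which is why treating it as a fact is listed as the ML/AI failure mode.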
Covers structured relationships (1:1, 1:N, M:N with junction tables, recursive/self-join), semi-structured (embed vs. reference decision), and unstructured (metadata envelopes, embedded structural relationships, inferred/latent relationships). Key insight: in structured data you declare relationships; in unstructured data you discover them.
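The structured cases above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under assumed table and column names (`customer`, `product`, `purchase`, `category`), not the chapter's own schema: a junction table resolves the M:N relationship, and a self-referencing foreign key gives the recursive one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- M:N: junction table with one row per (customer, product) pairing.
CREATE TABLE purchase (
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    product_id  INTEGER NOT NULL REFERENCES product(id),
    PRIMARY KEY (customer_id, product_id)
);

-- Recursive: a category may point at a parent category in the same table.
CREATE TABLE category (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    parent_id INTEGER REFERENCES category(id)
);
""")

conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, "trail shoe"), (2, "socks")])
conn.executemany("INSERT INTO purchase VALUES (?, ?)", [(1, 1), (1, 2), (2, 1)])
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [(1, "Footwear", None), (2, "Trail running", 1)],
)

# Resolve the M:N relationship through the junction table.
who_bought_what = conn.execute("""
    SELECT c.name, p.name
    FROM purchase pu
    JOIN customer c ON c.id = pu.customer_id
    JOIN product  p ON p.id = pu.product_id
    ORDER BY c.name, p.name
""").fetchall()

# Self-join: pair each child category with its parent.
child_parent = conn.execute("""
    SELECT child.name, parent.name
    FROM category child
    JOIN category parent ON parent.id = child.parent_id
""").fetchall()

# Referential integrity: a purchase for a non-existent customer is rejected.
try:
    conn.execute("INSERT INTO purchase VALUES (99, 1)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

Here the relationships are declared up front in the schema, which is the "declare" half of the chapter's key insight.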
Chris Date’s test for when a relationship becomes an entity: the moment it needs to carry its own data.
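One way to picture this test, as a small sketch with hypothetical names (`Purchase`, `quantity`, `purchased_on`): while the relationship is only a pairing of keys it can stay a plain link, but once it needs attributes of its own it gets an identity and becomes an entity.

```python
from dataclasses import dataclass
from datetime import date

# Before: the relationship is just a pairing of keys (customer_id, product_id).
purchases_as_link = {(42, 7)}


# After: the relationship carries its own data, so it is modeled as an entity.
@dataclass(frozen=True)
class Purchase:
    purchase_id: int      # the relationship now has its own identity
    customer_id: int
    product_id: int
    quantity: int         # data that belongs to neither customer nor product
    purchased_on: date


purchase = Purchase(
    purchase_id=1001,
    customer_id=42,
    product_id=7,
    quantity=1,
    purchased_on=date(2024, 1, 15),
)
```

The quantity and date describe the purchase itself rather than either endpoint, which is the signal that the link has outgrown a bare key pair.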
RDCO relevance
The embed-vs-reference framework is useful for advising clients on JSON column design in dbt staging models. The five-camp relationship view reinforces why we need to understand client architecture holistically, not just the warehouse layer.
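To make the embed-vs-reference choice concrete when discussing JSON column design, the two shapes can be compared side by side. This is an illustrative sketch only: the field names, SKUs, and the `embedded_is_bounded` size guard are assumptions for discussion, not rules taken from the chapter.

```python
# Hypothetical customer/order data, for illustration only.

# Embedded: child records live inside the parent JSON document.
customer_embedded = {
    "customer_id": 42,
    "orders": [
        {"order_id": 1001, "sku": "TRAIL-SHOE-9"},
        {"order_id": 1002, "sku": "SOCKS-M"},
    ],
}

# Referenced: the parent holds only keys; children live in their own collection.
customer_referenced = {"customer_id": 42, "order_ids": [1001, 1002]}
orders_by_id = {
    1001: {"order_id": 1001, "sku": "TRAIL-SHOE-9"},
    1002: {"order_id": 1002, "sku": "SOCKS-M"},
}


def resolve_orders(customer, orders):
    """Follow the stored keys to rebuild the embedded view."""
    return [orders[order_id] for order_id in customer["order_ids"]]


def embedded_is_bounded(document, field, max_items):
    """Assumed guard: True while the embedded array stays within an agreed cap."""
    return len(document[field]) <= max_items


resolved = resolve_orders(customer_referenced, orders_by_id)
```

Both shapes carry the same information; the referenced form costs a lookup, while the embedded form risks the document bloat named as the application camp's failure mode once the array has no upper bound.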