practical data modeling mma ch7 relationships

Wed Mar 11 2026 20:00:00 GMT-0400 (Eastern Daylight Time) · reference · source: Practical Data Modeling (Substack) · by Joe Reis
data-modeling · relationships · cardinality · junction-tables · knowledge-graphs · embeddings · mixed-model-arts · chapter-7

Relationships: Tying It Together (Ch 7)

Chapter 7 covers relationships: the connections between entities. It uses a trail-running shoe purchase to show the same relationship expressed five different ways, one per camp:

  1. Relational: Foreign keys, constraints, referential integrity. Failure mode: expensive joins at scale
  2. Analytical: Facts and dimensions, historical context. Failure mode: treating analytical models as transactional source of truth
  3. Application: Embedded documents, streaming windows. Failure mode: unbounded embedding causing document bloat
  4. ML/AI: Vector distance, cosine similarity, probabilistic relationships. Failure mode: treating probabilities as facts
  5. Knowledge: Semantic triples (subject-predicate-object), ontology-based inference. Failure mode: expensive ontology maintenance that drifts from reality
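As a rough illustration of the five camps, the sketch below writes one shoe purchase in each style. It is not taken from the chapter: every ID, field name, amount, and vector here is a made-up assumption.

```python
import math

# Hypothetical IDs, used only for illustration.
CUSTOMER_ID, PRODUCT_ID = 42, 7

# 1. Relational: the order row points at customer and product by foreign key.
order_row = {"order_id": 1001, "customer_id": CUSTOMER_ID, "product_id": PRODUCT_ID}

# 2. Analytical: a fact row holding dimension keys plus a measure.
fact_row = {
    "customer_key": CUSTOMER_ID,
    "product_key": PRODUCT_ID,
    "date_key": 20260311,  # assumed date-dimension key
    "amount": 139.99,      # assumed price
}

# 3. Application: the purchase embedded inside the customer document.
customer_doc = {
    "_id": CUSTOMER_ID,
    "orders": [
        {"order_id": 1001, "product": {"id": PRODUCT_ID, "name": "trail-running shoe"}}
    ],
}

# 4. ML/AI: the relationship is a similarity score between embeddings.
def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

customer_vec = [3.0, 4.0]  # made-up embedding
product_vec = [4.0, 3.0]   # made-up embedding
affinity = cosine_similarity(customer_vec, product_vec)  # a score, not a fact

# 5. Knowledge: a subject-predicate-object triple.
triple = ("customer:42", "purchased", "product:7")
```

The first three forms and the triple all assert that the purchase happened. The similarity score only suggests an affinity, which is why treating it as a fact is the failure mode listed for the ML/AI camp.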

The chapter covers relationships in three kinds of data: structured (1:1, 1:N, M:N with junction tables, recursive/self-join), semi-structured (the embed vs. reference decision), and unstructured (metadata envelopes, embedded structural relationships, inferred/latent relationships). Key insight: in structured data you declare relationships; in unstructured data you discover them.
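To make "declared" relationships concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema is an assumption for illustration, not the chapter's: a hypothetical customer_product junction table models M:N, and a hypothetical referred_by_id column models a recursive relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE customer (
    customer_id    INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    referred_by_id INTEGER REFERENCES customer(customer_id)  -- recursive
);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    model      TEXT NOT NULL
);
-- Junction table: one row per (customer, product) pair resolves the M:N.
CREATE TABLE customer_product (
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    PRIMARY KEY (customer_id, product_id)
);
INSERT INTO customer VALUES (1, 'Ana', NULL), (2, 'Ben', 1);
INSERT INTO product VALUES (10, 'Trail A'), (11, 'Trail B');
INSERT INTO customer_product VALUES (1, 10), (2, 10), (2, 11);
""")

# M:N: walk through the junction table in both directions.
owners = conn.execute("""
    SELECT c.name, p.model
    FROM customer c
    JOIN customer_product cp ON cp.customer_id = c.customer_id
    JOIN product p ON p.product_id = cp.product_id
    ORDER BY c.name, p.model
""").fetchall()

# Recursive: join the customer table to itself to find who referred whom.
referrals = conn.execute("""
    SELECT c.name, r.name
    FROM customer c
    JOIN customer r ON r.customer_id = c.referred_by_id
""").fetchall()
```

Because the relationships are declared as constraints, the database rejects a junction row that points at a customer who does not exist. That is the referential integrity the relational camp relies on.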

Chris Date’s test for when a relationship becomes an entity: the moment it needs to carry its own data.
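One way to read this test, sketched under assumptions: while the relationship is only a pair of keys, a set of pairs is enough. Once it needs its own attributes, it earns its own identity and becomes an entity. The Purchase class and all its fields are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import date

# Before: the relationship is only a (customer_id, product_id) pair.
# A set of pairs cannot record when, for how much, or how many times.
bought = {(1, 10), (2, 10)}

# After: the relationship carries its own data, so it becomes an entity
# with its own identifier (all field names are hypothetical).
@dataclass(frozen=True)
class Purchase:
    purchase_id: int
    customer_id: int
    product_id: int
    purchased_on: date
    price_usd: float

purchases = [
    Purchase(1, 1, 10, date(2026, 1, 5), 139.99),
    Purchase(2, 1, 10, date(2026, 3, 1), 129.99),  # same pair, second purchase
]

# The same customer/product pair now appears twice with different data.
distinct_pairs = {(p.customer_id, p.product_id) for p in purchases}
```

Here distinct_pairs holds a single pair while purchases holds two records, which the pair-only representation could not express.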

RDCO relevance

The embed-vs-reference framework is useful for advising clients on JSON column design in dbt staging models. The five-camp relationship view reinforces why we need to understand client architecture holistically, not just the warehouse layer.
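A minimal sketch of the embed-vs-reference trade-off for a JSON column, assuming a simple rule of thumb: embed small, bounded collections and reference ones that can grow without limit. That rule follows the "unbounded embedding" failure mode listed above. All field names and data are hypothetical.

```python
import json

# Embed: line items are bounded and always read together with the order.
order_embedded = {
    "order_id": 1001,
    "items": [{"sku": "TRAIL-SHOE-01", "qty": 1}],
}

# Reference: reviews can grow without bound, so they live apart from the
# product and point back to it by ID.
product = {"product_id": 7, "name": "trail-running shoe"}
reviews = [
    {"review_id": 1, "product_id": 7, "stars": 5},
    {"review_id": 2, "product_id": 7, "stars": 4},
]

def reviews_for(product_id, all_reviews):
    """Resolve the reference at read time (the document analogue of a join)."""
    return [r for r in all_reviews if r["product_id"] == product_id]

def json_size_bytes(doc):
    """Size of the document once serialized as UTF-8 JSON."""
    return len(json.dumps(doc).encode("utf-8"))

# Embedding the unbounded side shows the bloat: the product document
# grows with every review instead of staying a fixed size.
many_reviews = [{"review_id": i, "stars": 5} for i in range(1000)]
bloated_product = {**product, "reviews": many_reviews}
growth = json_size_bytes(bloated_product) - json_size_bytes(product)
```

The referenced product document stays the same size no matter how many reviews arrive. The embedded version grows with each one.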