Graph DB — Vertex and Edge Type Dictionary
Running data dictionary for the typed knowledge graph we’re prototyping on DuckDB. Maintained incrementally — add new types as they emerge from real query needs.
Vertex types (nodes)
| Type | Purpose | Examples |
|---|---|---|
| Document | Vault entries (source of most edges) | any 06-reference/*.md file |
| Person | Any individual: usually the author of source material, but also colleagues, clients, and other contacts | Ben Thompson, João Moura, Jaya Gupta, Joe Reis, Ben Wilson |
| Publication | Newsletter or outlet | Stratechery, Every, Semi-Structured, Practical Data Modeling |
| Topic | Concept/area the doc is about | “harness thesis”, “data quality”, “state ownership”, “BEAM methodology” |
| Framework | Named framework we reference or build | MAC, BEAM, Scope×Basis, Seven Powers, Entangled Software |
| Project | RDCO work stream | MG Progress, Sanity Check, autoinv, MAC content series, Graph DB eval |
| Sponsor | Entity sponsoring content | Anthropic, OpenAI, Box, Estuary |
| Organization | Company/institution | phData, MG, Mammoth Growth, Anthropic, Foundation Capital |
| Skill | RDCO skill file | process-newsletter, audit-model, postgrid, morning-prep |
| Cluster | Thematic grouping of docs | “harness-thesis dissent”, “data quality sources”, “moat debate” |
| Decision | Significant decision doc | phData vs MG, Mode A vs Mode B, SQL guardrail adoption |
| Source | Original external document | book PDFs, podcast transcripts, X articles |
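One way the vertex types above could be enforced at write time is a single `vertices` table with a `CHECK` constraint on the type column. This is a minimal sketch, not the prototype's actual schema: the table name, the `id`/`name` columns, and the sample ids are assumptions, and Python's stdlib `sqlite3` stands in for DuckDB so the example runs without extra dependencies.

```python
import sqlite3

# Vertex types transcribed from the table above; everything else is illustrative.
VERTEX_TYPES = (
    "Document", "Person", "Publication", "Topic", "Framework", "Project",
    "Sponsor", "Organization", "Skill", "Cluster", "Decision", "Source",
)


def create_vertex_table(conn: sqlite3.Connection) -> None:
    """Create a `vertices` table whose `type` column only accepts VERTEX_TYPES."""
    allowed = ", ".join(f"'{t}'" for t in VERTEX_TYPES)
    conn.execute(
        f"""
        CREATE TABLE vertices (
            id   TEXT PRIMARY KEY,  -- hypothetical stable key, e.g. a slug
            type TEXT NOT NULL CHECK (type IN ({allowed})),
            name TEXT NOT NULL
        )
        """
    )


conn = sqlite3.connect(":memory:")
create_vertex_table(conn)
conn.execute("INSERT INTO vertices VALUES ('person/ben-thompson', 'Person', 'Ben Thompson')")
conn.execute("INSERT INTO vertices VALUES ('pub/stratechery', 'Publication', 'Stratechery')")

# 'Author' is no longer a vertex type (it was renamed to Person), so this insert fails.
try:
    conn.execute("INSERT INTO vertices VALUES ('person/nobody', 'Author', 'Nobody')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Keeping the allowed types in one tuple means that adding a vertex type to this dictionary corresponds to a one-line change in the schema definition.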
Edge types (typed relations)
Semantic relations between documents
| Edge | Direction | Semantics |
|---|---|---|
| validates | A → B | A provides evidence supporting B’s claim |
| contradicts | A → B | A disputes B’s central thesis |
| disputes-claim-in | A → B | A disputes a specific claim in B (not the whole doc) |
| refines | A → B | A sharpens B’s framework without contradicting it |
| extends | A → B | A builds on B and takes it further |
| synthesizes | A → (B, C, D…) | A combines multiple sources into a view |
| echoes | A → B | A independently reaches B’s conclusion |
| applies-framework-from | A → B | A uses the framework defined in B |
| cites | A → B | A explicitly references B |
| mentions-in-passing | A → B | A refers to B only incidentally; weaker than cites |
Authorship and identity
| Edge | Direction | Semantics |
|---|---|---|
| authored-by | Document → Person | — |
| published-in | Document → Publication | — |
| sponsored-by | Document → Sponsor | Disclosed sponsorship |
| affiliated-with | Person → Organization | — |
| role-of | Person → Role (CEO, founder, analyst) | — |
Topical
| Edge | Direction | Semantics |
|---|---|---|
| about-topic | Document → Topic | Primary subject |
| mentions-entity | Document → Entity | Named entity reference |
| part-of-series | Document → Series | e.g., ADWD ch1 → “Agile Data Warehouse Design” book |
| part-of-cluster | Document → Cluster | e.g., Moura → “harness-thesis dissent” |
Strategic / operational
| Edge | Direction | Semantics |
|---|---|---|
| supports-position | Document → RDCO position | Evidence for a strategic stance |
| informs-decision | Document → Decision | Input that shaped a decision outcome |
| test-case-for | Case study → Framework | Concrete example of framework application |
| example-of | Document → Pattern/Concept | Illustration of a pattern |
Provenance
| Edge | Direction | Semantics |
|---|---|---|
| generated-by-skill | Document → Skill | Which skill produced this vault entry |
| derived-from-source | Document → Source | Which external doc this note is assessing |
| produced-by-project | Document → Project | Which RDCO project created this artifact |
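The Direction columns above can double as a machine-checkable registry. The sketch below transcribes the non-semantic edge types into a lookup and adds two helpers; the helper names are hypothetical and not part of any existing RDCO tooling. The second helper lists endpoint types that the edge tables use but the vertex table does not declare, which may be a useful prompt for the next dictionary revision.

```python
# Vertex types as declared in the vertex table.
VERTEX_TYPES = {
    "Document", "Person", "Publication", "Topic", "Framework", "Project",
    "Sponsor", "Organization", "Skill", "Cluster", "Decision", "Source",
}

# (source type, target type) per edge, transcribed from the Direction columns.
# The semantic relations (validates, contradicts, ...) are all Document -> Document
# and are omitted here for brevity.
EDGE_ENDPOINTS = {
    "authored-by": ("Document", "Person"),
    "published-in": ("Document", "Publication"),
    "sponsored-by": ("Document", "Sponsor"),
    "affiliated-with": ("Person", "Organization"),
    "role-of": ("Person", "Role"),
    "about-topic": ("Document", "Topic"),
    "mentions-entity": ("Document", "Entity"),
    "part-of-series": ("Document", "Series"),
    "part-of-cluster": ("Document", "Cluster"),
    "supports-position": ("Document", "RDCO position"),
    "informs-decision": ("Document", "Decision"),
    "test-case-for": ("Case study", "Framework"),
    "example-of": ("Document", "Pattern/Concept"),
    "generated-by-skill": ("Document", "Skill"),
    "derived-from-source": ("Document", "Source"),
    "produced-by-project": ("Document", "Project"),
}


def is_valid_edge(edge_type: str, src_type: str, dst_type: str) -> bool:
    """True if the edge type exists and its endpoints match the dictionary."""
    return EDGE_ENDPOINTS.get(edge_type) == (src_type, dst_type)


def undeclared_endpoint_types() -> set:
    """Endpoint types referenced by edges but absent from the vertex table."""
    used = {t for pair in EDGE_ENDPOINTS.values() for t in pair}
    return used - VERTEX_TYPES


ok = is_valid_edge("authored-by", "Document", "Person")
reversed_ok = is_valid_edge("authored-by", "Person", "Document")
missing = undeclared_endpoint_types()
```

With the tables as currently written, `missing` contains Role, Entity, Series, RDCO position, Case study, and Pattern/Concept.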
Edge metadata (properties on edges)
Every edge should carry:
- created_at (when the edge was asserted)
- confidence (high / medium / low)
- source (how we know — manual assertion, inferred from cross-check, extracted from wikilink)
- evidence_quote (optional, for validates/contradicts)
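The four properties above map naturally onto columns of an edge table. The following is a sketch under assumptions: the `edges` table layout, the `src_id`/`dst_id`/`edge_type` column names, the `add_edge` helper, and the sample ids are all hypothetical, and stdlib `sqlite3` is used as a stand-in for DuckDB.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE edges (
        src_id         TEXT NOT NULL,
        dst_id         TEXT NOT NULL,
        edge_type      TEXT NOT NULL,
        created_at     TEXT NOT NULL,  -- ISO date the edge was asserted
        confidence     TEXT NOT NULL CHECK (confidence IN ('high', 'medium', 'low')),
        source         TEXT NOT NULL,  -- how we know (manual, cross-check, wikilink)
        evidence_quote TEXT            -- optional; mainly for validates/contradicts
    )
    """
)


def add_edge(conn, src_id, dst_id, edge_type, *, confidence, source,
             evidence_quote=None, created_at=None):
    """Insert one edge; created_at defaults to today's date in ISO format."""
    conn.execute(
        "INSERT INTO edges VALUES (?, ?, ?, ?, ?, ?, ?)",
        (src_id, dst_id, edge_type, created_at or date.today().isoformat(),
         confidence, source, evidence_quote),
    )


# Hypothetical sample edges.
add_edge(conn, "doc-a", "doc-b", "validates", confidence="high",
         source="manual assertion", evidence_quote="(quoted passage goes here)")
add_edge(conn, "doc-a", "doc-c", "cites", confidence="low",
         source="extracted from wikilink")

# A confidence value outside high / medium / low violates the CHECK constraint.
try:
    add_edge(conn, "doc-a", "doc-d", "cites", confidence="certain",
             source="manual assertion")
    bad_confidence_rejected = False
except sqlite3.IntegrityError:
    bad_confidence_rejected = True
```

Leaving `evidence_quote` nullable keeps it optional, while the other three properties are required on every edge.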
Query catalog (what this graph must answer)
These are the queries that justify the graph over QMD-only retrieval:
1. Cascading contradictions — multi-hop contradiction chains
2. Dissent cluster aggregation — which authors appear in ≥N dissent docs
3. Source authority ranking — validates/contradicts edges received per author
4. Framework adoption trajectory — which docs apply which framework, over time
5. Positioning evidence compilation — docs with a supports-position edge to a given stance
6. Derivation path — shortest path between two conceptually linked docs
7. Gap detection in a cluster — subtopics with insufficient sources
8. Person citation network — inbound + outbound citation traversal
9. Skill improvement provenance — findings → improvements → skill changes (two-hop)
10. Decision-evidence audit — all docs that informed a given decision
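To make the multi-hop requirement concrete, here is a sketch of the cascading-contradictions query as a recursive CTE. The three-column `edges` table and the `doc-a`/`doc-b`/`doc-c` ids are hypothetical sample data. The SQL runs on stdlib `sqlite3` here; DuckDB also supports `WITH RECURSIVE`, but this exact statement has not been checked there and the parameter syntax may need adjusting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src_id TEXT, dst_id TEXT, edge_type TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [
        ("doc-a", "doc-b", "contradicts"),
        ("doc-b", "doc-c", "contradicts"),
        ("doc-a", "doc-c", "cites"),  # ignored: not a contradiction edge
    ],
)

CHAIN_SQL = """
WITH RECURSIVE chain(start_id, end_id, hops, path) AS (
    -- Base case: every direct contradiction is a one-hop chain.
    SELECT src_id, dst_id, 1, src_id || ' > ' || dst_id
    FROM edges
    WHERE edge_type = 'contradicts'
    UNION ALL
    -- Recursive step: extend a chain by one more contradiction edge.
    SELECT c.start_id, e.dst_id, c.hops + 1, c.path || ' > ' || e.dst_id
    FROM chain AS c
    JOIN edges AS e ON e.src_id = c.end_id
    WHERE e.edge_type = 'contradicts'
      AND c.hops < :max_hops  -- hop cap also bounds cyclic graphs
)
SELECT start_id, end_id, hops, path
FROM chain
WHERE hops >= 2
ORDER BY hops, path
"""

chains = conn.execute(CHAIN_SQL, {"max_hops": 4}).fetchall()
```

On the sample data, `chains` holds a single two-hop chain from `doc-a` through `doc-b` to `doc-c`. The hop cap is a design choice rather than a requirement: it keeps the recursion finite if two docs contradict each other.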
Prototype scope (first 3 queries to prove out)
- Positioning evidence compilation (query #5) — “Find all vault entries that support the state-ownership architecture”
- Dissent cluster aggregation (query #2) — “Which authors appear in ≥3 dissent docs?”
- Decision-evidence audit (query #10) — “What evidence informed the phData vs MG decision?”
These three cover the most immediate operational use cases.
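As an illustration of the second prototype query, the sketch below counts dissent docs per author. It assumes that a "dissent doc" is any document with a part-of-cluster edge to a given dissent cluster, which the dictionary does not state explicitly. The table layout, cluster id, and all document and person ids are hypothetical, and stdlib `sqlite3` again stands in for DuckDB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src_id TEXT, dst_id TEXT, edge_type TEXT)")

CLUSTER = "cluster/harness-thesis-dissent"  # hypothetical cluster id
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [
        # person-x wrote three docs in the dissent cluster
        ("d1", "person-x", "authored-by"), ("d1", CLUSTER, "part-of-cluster"),
        ("d2", "person-x", "authored-by"), ("d2", CLUSTER, "part-of-cluster"),
        ("d3", "person-x", "authored-by"), ("d3", CLUSTER, "part-of-cluster"),
        # person-y wrote one dissent doc and one doc outside the cluster
        ("d4", "person-y", "authored-by"), ("d4", CLUSTER, "part-of-cluster"),
        ("d5", "person-y", "authored-by"),
    ],
)

DISSENT_SQL = """
SELECT a.dst_id AS person, COUNT(DISTINCT a.src_id) AS dissent_docs
FROM edges AS a
JOIN edges AS c ON c.src_id = a.src_id      -- same document
WHERE a.edge_type = 'authored-by'
  AND c.edge_type = 'part-of-cluster'
  AND c.dst_id = :cluster
GROUP BY a.dst_id
HAVING COUNT(DISTINCT a.src_id) >= :min_docs
ORDER BY dissent_docs DESC, person
"""

frequent_dissenters = conn.execute(
    DISSENT_SQL, {"cluster": CLUSTER, "min_docs": 3}
).fetchall()
```

Only `person-x` clears the threshold of three docs in this sample. The decision-evidence audit follows the same shape with a single filter on `informs-decision` edges pointing at the decision vertex.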
Versioning
This dictionary is iterative. When a query surfaces a need for a new edge type, add it here with the date. When an edge type turns out to be redundant or never-used after 2+ months, deprecate it here.
| Date | Change | Reason |
|---|---|---|
| 2026-04-14 | Initial dictionary | Founder asked for running data dictionary during graph DB eval scoping |
| 2026-04-14 | Renamed Author → Person | Author is a role relative to a doc. Person is the entity. Covers non-authors too (colleagues, clients, friend-of-founder contacts like the brother-in-law or vertical-farm friend). The authored-by edge still exists and connects Document → Person. |
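The deprecation rule stated under Versioning could be checked mechanically. This sketch treats "2+ months" as 60 days, which is an assumption, and flags edge types that have been in the dictionary at least that long without ever appearing on an edge. The function name, the sample edge types marked unused, and the `hypothetical-new-edge` entry are illustrative only and do not reflect real usage data.

```python
from datetime import date, timedelta


def deprecation_candidates(added_on, used_types, today, grace_days=60):
    """Edge types older than the grace period that no edge has ever used.

    added_on:   mapping of edge type -> date it was added to the dictionary
    used_types: set of edge types that appear on at least one edge
    """
    cutoff = today - timedelta(days=grace_days)
    return sorted(
        edge_type
        for edge_type, added in added_on.items()
        if added <= cutoff and edge_type not in used_types
    )


# Hypothetical sample state, evaluated as of 2026-06-20.
added_on = {
    "cites": date(2026, 4, 14),
    "echoes": date(2026, 4, 14),
    "hypothetical-new-edge": date(2026, 6, 1),  # too recent to judge
}
used_types = {"cites"}

candidates = deprecation_candidates(added_on, used_types, today=date(2026, 6, 20))
```

In this sample only `echoes` is flagged. Redundancy between edge types still needs human judgment, so a check like this would only cover the never-used half of the rule.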