Amazon WBRs + Metric Trees + Business-Stack View — research and HQ design spec
Why this exists
Founder ask, 2026-04-30 (verbatim):
It’d be great if the HQ gave us a way to see each business stack and how the processes flowed.
- instrumentation - list of what our inputs are
- tools - what actions can we take
- targeting system - what is our evaluation criteria
- feedback - a list of decisions for how we updated our process (added instrumentation, how we used a tool, etc.)
Then if we click on the targeting system we should be able to see a business process DAG for how the tools and readings generate the outcome that our targeting system tracks. The feedback decision traces could be in a table below that.
I’m thinking Amazon WBRs and Metric Trees. There has to be more research into this style of management out there.
This shouldn’t all live on the HQ dashboard. I would think each small bet gets a different page.
The founder’s four-layer model — instrumentation / tools / targeting / feedback — is a clean OODA-style restatement of the Amazon WBR + Metric Tree pattern, applied at the granularity of a single small bet rather than a full company. The job of this brief is to (1) sanity-check that intuition against the existing literature, (2) lift the load-bearing patterns from WBRs and Metric Trees, and (3) translate them into a concrete HQ feature spec that fits the existing Astro + Hono+Bun stack.
Research findings: WBRs, Metric Trees, and adjacent management styles
Amazon Weekly Business Reviews (WBR)
Source-of-truth in the vault: 2026-04-15-commoncog-amazon-weekly-business-review and 2026-04-15-commoncog-working-backwards. Cross-referenced against Bryar/Carr’s Working Backwards and Holistics’ WBR breakdowns.
Load-bearing structure (compressed):
- Three goals, in strict order. (1) What did our customers experience last week? (2) How did our business do last week? (3) Are we on track to hit targets? Customer-first framing is deliberate — internal-first framing biases the metric set toward vanity.
- Controllable input metrics ≫ output metrics. Output metrics (revenue, DAU/MAU, FCF) are what you care about, but operationally you are not allowed to discuss them. You drive output by hunting for controllable input metrics — directly actionable levers — and tracking whether they still move the output. When an input stops correlating, you discard it and find a new one. The causal model lives in the input→output graph, and it's expected to evolve. Amazon's canonical example: the "selection" input metric evolved through `# detail pages` → `detail page views` → `in-stock viewed pages` → `Fast Track In Stock` over multiple WBR cycles as each prior version stopped correlating with sales.
- Anomaly-led narrative. Every metric in the deck gets one second of stare-time. Routine variation earns "nothing to see here" and the meeting moves on. Only exceptional variation gets discussed — and the metric owner must either explain it or say "I don't know, still investigating." Fabricated explanations are forbidden.
- Three visualization types, forever. The 6-12 graph (trailing 6 weeks + trailing 12 months on the same axis, prior-year ghost line, target triangles, box scores), the 6-12 table, and plain tables. Same fonts, same colors, same layout every week — fingertip-feel requires repetition.
- DMAIC lifecycle per metric. Define → Measure → Analyze → Improve → Control. Every metric on the deck has an owner, a definition document, an audit trail, and a reason to live. Metrics get retired when they stop measuring what they claim.
- Six-Pager memos replace slide decks for substantive discussion. Read silently for ~20 minutes at the top of any non-WBR meeting. Forces the writer to construct a coherent argument rather than hide behind bullet structure.
Metric Trees
The directly relevant complementary pattern. Sources: Mixpanel (“Metric trees 101”), Count.co (“Intro to metric trees”), and Paul Levchuk’s “Metric Tree Trap” critique.
Definition: a metric tree is a hierarchical decomposition of a top-level (north-star) metric into the operational drivers that produce it. Edges represent causal or arithmetic influence (e.g., signups = traffic × conversion × acceptance). Structurally a DAG — a node can have multiple parents — but typically presented as a tree for readability.
Three layers in canonical form:
- Root — the north-star or output metric (e.g., revenue, retention, P&L)
- Branches — primary drivers (2-3 typically), the immediate decomposition
- Leaves — the controllable input metrics teams act on day-to-day
Build process (Mixpanel/Count consensus):
- Define the north star
- Decompose into 2-3 components recursively until you hit something operationally controllable
- Assign owners per leaf
- Make it a living ritual — reviewed quarterly minimum, evolved when a leaf stops correlating with its parent
Critical tension surfaced by Levchuk: mathematical decomposition can obscure causal structure. A clean equation tree (revenue = price × volume) is not the same as a causal tree (what moves volume?). The Amazon WBR sidesteps this by demanding that the leaf metric be both controllable AND empirically correlated with the output — not just arithmetically related. This is exactly the founder’s instinct in saying the “targeting system” should be the trackable outcome and the DAG should show how tools and readings generate that outcome. He’s asking for a causal metric tree, not an arithmetic one.
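The arithmetic-vs-causal distinction can be made concrete in a small sketch. Everything below is hypothetical — the types and the example nodes are illustrative, not part of any existing codebase — but it shows the invariant the WBR discipline imposes: edges declare whether they are arithmetic identities or testable causal hypotheses, and every leaf must be something a team can actually move.

```typescript
// Hypothetical sketch of a metric tree whose edges distinguish arithmetic
// identities from causal hypotheses. All names are illustrative.
type EdgeKind = 'arithmetic' | 'causal';

interface MetricNode {
  name: string;
  controllable: boolean; // can a tool move this metric directly?
  children: { kind: EdgeKind; node: MetricNode }[];
}

// The WBR discipline as an invariant: every leaf of the tree must be
// controllable. (Causal edges are the ones you test and retire when
// they stop correlating with the parent.)
function leavesAreActionable(root: MetricNode): boolean {
  if (root.children.length === 0) return root.controllable;
  return root.children.every(({ node }) => leavesAreActionable(node));
}

const revenue: MetricNode = {
  name: 'revenue',
  controllable: false,
  children: [
    { kind: 'arithmetic', node: { name: 'price', controllable: true, children: [] } },
    {
      kind: 'arithmetic',
      node: {
        name: 'volume',
        controllable: false,
        children: [
          // The causal layer underneath the arithmetic identity:
          { kind: 'causal', node: { name: 'in-stock viewed pages', controllable: true, children: [] } },
        ],
      },
    },
  ],
};
```

A pure equation tree stops at `price × volume`; the causal version keeps decomposing until the leaf is a lever.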
Adjacent operating-rhythm patterns
- Andy Grove iterative measurement (High Output Management, OKRs). Quarterly cadence, paired metrics (always measure quantity AND quality of any output to avoid Goodharting), and "indicators are most useful when they're paired" — anticipates Amazon's input/output discipline by ~25 years. The cadence lesson for RDCO: quarterly is right for goal-setting; weekly is right for instrumentation review; daily is too tight for a portfolio of small bets.
- Bridgewater "dot collector" / Principles operating system. Real-time, attribute-tagged feedback on every meeting participant, fed into a believability-weighted decision audit log. The relevant lift here is the decision-trace pattern itself: decisions are recorded, attributed, and reviewable; outcomes are tied back to decisions later. RDCO doesn't need believability weighting, but the founder's "feedback table" is exactly this — a decision log that links each process update to the layer it tightened.
- Bezos single-threaded leader (STL) pattern. Each major initiative has one leader, a separable team, and minimal dependencies. Implication for the HQ design: one bet = one page = one targeting system = one DAG. Don't merge multiple bets into a portfolio dashboard; the founder already specified this, and the STL pattern validates it.
- OODA / closed-loop control theory in general. The founder's four-layer breakdown (instrumentation / tools / targeting / feedback) maps onto Boyd's OODA loop almost perfectly: observe (instrumentation), orient and decide (targeting), act (tools), reflect (feedback). The literature consensus is that the feedback step is where most teams fail — they instrument, they have tools, they set targets, and then they never close the loop on what they learned. The decision-trace table is what closes the loop.
What to adopt for RDCO
RDCO is a 1-person portfolio of 3-5 small bets (Squarely, MAC, Sanity Check, Mother of All Cron candidate, etc.). Full-Amazon WBR overhead is wrong-sized — you can’t run a 60-minute weekly meeting with 400 metrics across 5 bets when you’re solo. Three patterns to lift:
- Controllable-input-metric discipline at the per-bet level. Each bet's targeting system is its output metric (P&L for Squarely; reliability + adoption for MAC; subscriber LTV for Sanity Check). The instrumentation layer is the controllable inputs we believe drive that output. The decision-trace table records when an input stopped correlating and we replaced it. This is the WBR pattern compressed to a single-bet, asynchronous-review form factor.
- Decision-trace table as the WBR substitute. Instead of a Wednesday meeting, a chronological log per bet recording: date, decision, which layer changed (instrumentation/tools/targeting), observed outcome shift, and why we made the call. Founder reviews on his own cadence (weekly skim, quarterly deep). This preserves the audit-trail benefit of the WBR without the meeting overhead.
- Causal DAG view (not arithmetic decomposition). The targeting-system drill-down should show the causal flow: which tools produce which intermediate readings, and which readings feed the targeting metric. This is the Levchuk-correct version of a metric tree — not just `revenue = price × volume` arithmetic, but "we run X tool → it generates Y reading → which we believe influences Z target." This makes the DAG editable as the causal model evolves (Amazon's "input metric retired" pattern).
Patterns explicitly not worth porting: the 6-12 graph (overkill for solo cadence), the six-pager memo (already covered by vault notes), the Bar Raiser hiring pattern (no hiring), the dot collector (no team).
Design spec for HQ “Business Stack” feature
Information architecture
- Each small bet gets its own page at `hq.raydata.co/bets/<slug>` (per founder's ask — bets are not the home dashboard).
- The HQ home dashboard gets a single new pane: a "Bets" navigator — a list of the 3-5 active bets, each linking to its own page. Critical-component bets get a star.
- Per-bet page sections, top-to-bottom (one row each):
1. Header strip — Bet name, one-line thesis, current status (Active / Paused / Sunset), critical-component flag, last-decision-trace-update timestamp.
2. Instrumentation — list of inputs we’re reading. Per row:
- Sensor name (e.g., “Squarely weekly Stripe revenue”, “MAC quickstart-completion rate”)
- Source (Stripe API, Notion column, manual log, vault note path)
- Current value (if live-pullable) or “manual”
- Last update timestamp
- Is-controllable flag (boolean — does adjusting tools actually move this?)
3. Tools — list of actuators. Per row:
- Tool name (e.g., “Run Meta ad campaign”, “Ship MAC quickstart v2”, “Publish SC issue”)
- Status: Built / Partial / Gap
- Last invocation date (when applicable)
- Linked playbook / SOP path
4. Targeting system — the primary evaluation criteria. Single rich block:
- Output metric (e.g., “Squarely 90-day P&L > 0”, “MAC reliability ≥ 99% across X clients”, “SC subscriber LTV > acquisition cost”)
- Current value vs target
- Cadence (when do we judge it: weekly, monthly, quarterly?)
- Click-through: opens the Targeting-System Drill-Down view (below)
5. Feedback — chronological decision-trace table. Per row:
- Date
- Decision (one sentence)
- Layer touched (instrumentation / tools / targeting)
- What changed (added sensor X, retired tool Y, redefined target Z)
- Observed outcome shift (filled in retrospectively when known)
- Linked vault note (if longer reasoning lives there)
Targeting-system drill-down view
Click on the targeting system block → opens `hq.raydata.co/bets/<slug>/targeting`:
- Top half: process DAG. Nodes = tools (left), readings (middle), targeting metric (right, terminal node). Edges = "tool produces reading" or "reading influences target." Multi-parent edges allowed (real causal graphs aren't trees). Hovering on a node shows its definition + current value. Hovering on an edge shows the causal hypothesis ("we believe ad-spend → reach → signups → revenue").
- Bottom half: decision-trace table. Same shape as the per-bet feedback section but scoped to decisions that touched the targeting system specifically. Columns: date, decision, input/edge tweaked, observed outcome shift, vault link.
- Editor affordance (Phase B): ability to mark a leaf input as "retired — stopped correlating" with a pointer to the decision-trace row that retired it. This is the Amazon "input metric retirement" pattern made visible.
Data model (proposed)
// Each bet is one record. Stored in a Notion DB ("RDCO Bets") for founder-edit ergonomics.
interface Bet {
id: string;
slug: string; // squarely, mac, sanity-check, ...
name: string;
thesis: string;
status: 'active' | 'paused' | 'sunset';
isCriticalComponent: boolean;
vaultProjectPath: string; // 01-projects/<name>/
}
// Three child collections per bet (Notion sub-DBs or relation fields):
interface Sensor {
betId: string;
name: string;
source: string;
currentValue?: string | number;
lastUpdate?: string;
isControllable: boolean;
retiredOn?: string; // when we retired it; null if live
retiredByDecisionId?: string;
}
interface Tool {
betId: string;
name: string;
status: 'built' | 'partial' | 'gap';
lastInvocation?: string;
playbookPath?: string;
}
interface DecisionTrace {
betId: string;
date: string;
decision: string;
layer: 'instrumentation' | 'tools' | 'targeting';
whatChanged: string;
observedOutcomeShift?: string;
vaultNotePath?: string;
}
// Targeting + DAG
interface Targeting {
betId: string;
outputMetric: string;
currentValue?: string;
target: string;
cadence: 'weekly' | 'monthly' | 'quarterly';
}
interface DagEdge {
betId: string;
fromNode: string; // sensor.name OR tool.name OR intermediate reading
toNode: string;
hypothesis: string; // the causal claim
}
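One consistency check the API could run when serving the drill-down: every `DagEdge` must reference a known node (a sensor, a tool, or the targeting metric), or the rendered DAG silently drops edges. This is a sketch against the proposed shapes above; `danglingEdges` is an invented helper, not an existing function.

```typescript
// Sketch: find DAG edges that reference unknown nodes. The DagEdge type
// mirrors the proposed data model; `danglingEdges` is a hypothetical helper.
interface DagEdge {
  betId: string;
  fromNode: string;
  toNode: string;
  hypothesis: string;
}

function danglingEdges(edges: DagEdge[], knownNodes: string[]): DagEdge[] {
  const known = new Set(knownNodes);
  return edges.filter(e => !known.has(e.fromNode) || !known.has(e.toNode));
}

// Illustrative Squarely-flavored data (names are placeholders):
const edges: DagEdge[] = [
  { betId: 'squarely', fromNode: 'Run Meta ad campaign', toNode: 'Weekly signups', hypothesis: 'ad spend drives signups' },
  { betId: 'squarely', fromNode: 'Weekly signups', toNode: '90-day P&L', hypothesis: 'signups convert to revenue' },
];
const nodes = ['Run Meta ad campaign', 'Weekly signups', '90-day P&L'];
// danglingEdges(edges, nodes) is empty when the graph is closed over known nodes.
```

Running this at read time (rather than write time) fits the Notion-as-editor decision below: the founder can type freely in Notion, and the API flags inconsistencies instead of blocking edits.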
Implementation considerations
HQ stack today (per `~/.claude/state/hq-phase3.5-status.md`):
- Frontend: Astro + Tailwind at `hq.raydata.co`, deployed via Cloudflare Pages, behind Cloudflare Access (OTP for `ben@raydata.co`)
- API: Hono+Bun read-only, running on Mac Mini at `localhost:8787`, exposed via Cloudflare Tunnel as `hq-api.raydata.co`, behind Access Service Auth via the Pages Function proxy at `/api/[[path]].ts`
- Existing endpoints: `/api/today`, `/api/feed`, `/api/decisions`, `/api/projects`, `/api/vault/search`, `/api/vault/doc`, `/api/cron`, `/api/health`
Decisions to make for this feature:
- Storage location. Recommendation: Notion DB ("RDCO Bets" with linked sub-DBs for Sensors, Tools, DecisionTraces, Targeting, DagEdges). Founder edits in Notion — same pattern as the existing Notion task board — and the API reads via the Notion MCP equivalent. Rationale: founder needs to add/retire sensors and log decisions on his own time; he won't open a CLI to do it. Notion is the surface he already uses. Local SQLite on the Mac Mini is faster but creates an editing-friction tax that will kill the loop.
- Editor responsibility. Founder owns the targeting and bet roster (those are judgment calls). Ray auto-appends decision-trace rows whenever a skill ships a meaningful update to a bet (e.g., when `/check-board` closes a bet-tagged task, when a deploy lands, when a SC issue ships). Ray drafts; founder approves on review. Sensors and tools are jointly editable.
- DAG rendering. Phase B: Mermaid (declarative, no JS-framework cost, renders fine in Astro via the existing prose styles). Phase C: revisit react-flow if the DAGs get complex enough that interactive editing matters. Mermaid covers the "show me the causal flow" use case at near-zero implementation cost.
- New API endpoints (Phase A):
  - `GET /api/bets` — list of bets for the navigator
  - `GET /api/bets/<slug>` — full bet record with sensors, tools, targeting, and decision traces
  - `GET /api/bets/<slug>/dag` — DAG nodes + edges for the drill-down (Phase B)
- New Astro routes:
  - `src/pages/bets/index.astro` — bet navigator (also embedded as a pane on the HQ home)
  - `src/pages/bets/[slug].astro` — per-bet stack page
  - `src/pages/bets/[slug]/targeting.astro` — targeting drill-down + DAG + decision traces
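To make the Phase A list endpoint's shape concrete, here is a sketch of the handler logic for `GET /api/bets`. In the real route this would be registered on the existing Hono app in `~/rdco-hq-api/src/routes/`; here it is a plain function with the Notion read stubbed out, and the `BetSummary` fields follow the proposed `Bet` interface. All names beyond those are invented for illustration.

```typescript
// Sketch of the GET /api/bets payload for the navigator pane.
// In production this body would sit inside `app.get('/api/bets', ...)`
// on the Hono app; the Notion-backed read is stubbed below.
interface BetSummary {
  slug: string;
  name: string;
  status: 'active' | 'paused' | 'sunset';
  isCriticalComponent: boolean;
}

// Stub for the "RDCO Bets" Notion DB read (real implementation would go
// through whatever Notion client the API adopts).
async function readBetsFromNotion(): Promise<BetSummary[]> {
  return [
    { slug: 'squarely', name: 'Squarely', status: 'active', isCriticalComponent: true },
    { slug: 'old-bet', name: 'Retired experiment', status: 'sunset', isCriticalComponent: false },
  ];
}

// Navigator rule sketched from the spec: hide sunset bets, and surface
// critical-component (starred) bets first.
async function listBets(): Promise<{ bets: BetSummary[] }> {
  const all = await readBetsFromNotion();
  const bets = all
    .filter(b => b.status !== 'sunset')
    .sort((a, b) => Number(b.isCriticalComponent) - Number(a.isCriticalComponent));
  return { bets };
}
```

Keeping the handler body a pure function over the Notion read also makes the eventual Phase C caching layer (live sensor values) a drop-in replacement for the stub.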
Phased build
Phase A (MVP, ~2-3 days):
- Notion DB schema set up (founder + Ray collaborate on initial population for Squarely, MAC, Sanity Check)
- New `/api/bets` + `/api/bets/<slug>` endpoints in `~/rdco-hq-api/src/routes/`
- Static layout (no DAG yet, no live sensor pulls — manual values in Notion)
- Decision-trace table renders from Notion rows
Phase B (~1-2 days after A is in use):
- Targeting drill-down route with Mermaid DAG rendered from the `DagEdges` collection
- Decision-trace table scoped to targeting-system changes
- Auto-append decision-trace rows from `/check-board` and deploy hooks (Ray-side write paths into Notion)
Phase C (~as instrumentation matures):
- Live sensor values pulled from underlying sources (Stripe webhook → cache → API; QMD vault counts; Notion column reads)
- “Retired sensor” UI affordance with decision-trace linkage
- Cadence reminders (per-bet WBR-equivalent: weekly skim due, quarterly deep due) surfaced on the HQ home
Open questions for founder
- Decision-trace ownership: Ray-auto-logged or founder-manual? Recommendation is hybrid (Ray drafts, founder approves), but if you want the trace table to be only founder-curated (highest signal, lower volume), say so before Phase A — it changes the API write surface.
- Phase A scope: all 3-5 bets at once, or pilot with one (Squarely)? Pilot is faster to ship and iterates the schema before we lock it in across the portfolio. Big-bang is more impressive but bakes in schema mistakes. Default recommendation: pilot Squarely first, then port MAC + SC once the shape feels right.
- DAG editor: Phase B or never? If you only need to view the DAG, Mermaid is enough. If you want to drag-edit nodes (add a new sensor visually), we need react-flow and Phase B grows. The initial spec assumes view-only; founder may want editing.
- Cadence enforcement: passive or active? Should HQ surface "your Squarely WBR-equivalent is 9 days overdue" as a Decision-Needed item, or stay passive and trust the founder to open the page when he wants? The Amazon discipline says active. Founder discipline preference may differ.
Related
- 2026-04-15-commoncog-amazon-weekly-business-review — canonical WBR reference in the vault
- 2026-04-15-commoncog-working-backwards — five Amazon mechanisms; controllable input metrics defined
- 2026-04-15-commoncog-becoming-data-driven-first-principles — SPC worldview underneath the WBR
- ../01-projects/hq-raydata-co/2026-04-22-hq-proposal-v1 — HQ v1 proposal (this feature is a v2 add)
- ../01-projects/data-quality-framework/testing-matrix-template — MAC as the controllable-input-metric layer for data models (one bet’s instrumentation system)
- `~/.claude/state/hq-phase3.5-status.md` — current HQ stack (Astro + Hono+Bun, Cloudflare Pages + Tunnel, Access proxy)
- ../01-projects/squarely-puzzles/ — first candidate bet for the Phase A pilot
- Memory: `feedback_critical_component_field` — Notion has the field; this feature surfaces it on the bet page
Changelog
- 2026-04-30 (Ray, draft for founder review) — initial research + spec in response to founder’s 04-30 ask