2026-04-30 · research-brief · status: canonical

Amazon WBRs + Metric Trees + Business-Stack View — research and HQ design spec

Why this exists

Founder ask, 2026-04-30 (verbatim):

It’d be great if the HQ gave us a way to see each business stack and how the processes flowed.

  • instrumentation - list of what our inputs are
  • tools - what actions can we take
  • targeting system - what is our evaluation criteria
  • feedback - a list of decisions for how we updated our process (added instrumentation, how we used a tool, etc.)

Then if we click on the targeting system we should be able to see a business process DAG for how the tools and readings generate the outcome that our targeting system tracks. The feedback decision traces could be in a table below that.

I’m thinking Amazon WBRs and Metric Trees. There has to be more research into this style of management out there.

This shouldn’t all live on the HQ dashboard. I would think each small bet gets a different page.

The founder’s four-layer model — instrumentation / tools / targeting / feedback — is a clean OODA-style restatement of the Amazon WBR + Metric Tree pattern, applied at the granularity of a single small bet rather than a full company. The job of this brief is to (1) sanity-check that intuition against the existing literature, (2) lift the load-bearing patterns from WBRs and Metric Trees, and (3) translate them into a concrete HQ feature spec that fits the existing Astro + Hono + Bun stack.

Research findings: WBRs, Metric Trees, and adjacent management styles

Amazon Weekly Business Reviews (WBR)

Source-of-truth in the vault: 2026-04-15-commoncog-amazon-weekly-business-review and 2026-04-15-commoncog-working-backwards. Cross-referenced against Bryar/Carr’s Working Backwards and Holistics’ WBR breakdowns.

Load-bearing structure (compressed):

  1. Three goals, in strict order. (1) What did our customers experience last week? (2) How did our business do last week? (3) Are we on track to hit targets? Customer-first framing is deliberate — internal-first framing biases the metric set toward vanity.

  2. Controllable input metrics ≫ output metrics. Output metrics (revenue, DAU/MAU, FCF) are what you care about, but operationally you are not allowed to discuss them. You drive output by hunting for controllable input metrics — directly actionable levers — and tracking whether they still move the output. When an input stops correlating, you discard it and find a new one. The causal model lives in the input→output graph, and it’s expected to evolve. Amazon’s canonical example: the “selection” input metric evolved through “# detail pages” → “detail page views” → “in-stock viewed pages” → “Fast Track In Stock” over multiple WBR cycles, as each prior version stopped correlating with sales. (A minimal sketch of the stopped-correlating test follows this list.)

  3. Anomaly-led narrative. Every metric in the deck gets one second of stare-time. Routine variation earns “nothing to see here” and the meeting moves on. Only exceptional variation gets discussed — and the metric owner must either explain it or say “I don’t know, still investigating.” Fabricated explanations forbidden.

  4. Three visualization types, forever. The 6-12 graph (trailing 6 weeks + trailing 12 months on the same axis, prior-year ghost line, target triangles, box scores), the 6-12 table, plain tables. Same fonts, same colors, same layout every week — fingertip-feel requires repetition.

  5. DMAIC lifecycle per metric. Define → Measure → Analyze → Improve → Control. Every metric on the deck has an owner, a definition document, an audit trail, and a reason to live. Metrics get retired when they stop measuring what they claim.

  6. Six-Pager memos replace slide decks for substantive discussion. Read silently for ~20 minutes at the top of any non-WBR meeting. Forces the writer to construct a coherent argument rather than hide behind bullet structure.
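
The “input stopped correlating” test in point 2 is mechanical enough to sketch. A minimal TypeScript illustration (not Amazon’s tooling; the function name, the 0.5 cutoff, and the data shapes are all invented here):

// Pearson correlation between an input metric's weekly readings and the
// output metric it is supposed to drive. Guard rails omitted for brevity.
function pearson(xs: number[], ys: number[]): number {
  const n = Math.min(xs.length, ys.length);
  const mean = (v: number[]) => v.reduce((a, b) => a + b, 0) / n;
  const mx = mean(xs.slice(0, n));
  const my = mean(ys.slice(0, n));
  let cov = 0, vx = 0, vy = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - mx;
    const dy = ys[i] - my;
    cov += dx * dy;
    vx += dx * dx;
    vy += dy * dy;
  }
  return cov / Math.sqrt(vx * vy);
}

// An input metric earns its place only while it still moves the output.
// 0.5 is an arbitrary placeholder threshold, not an Amazon number.
const stillEarnsItsPlace = (input: number[], output: number[]) =>
  Math.abs(pearson(input, output)) >= 0.5;

In practice Amazon judges this by eye on the standard charts rather than by a single statistic; the point is that the retire-or-keep decision is explicit and recurring.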

Metric Trees

The directly relevant complementary pattern. Sources: Mixpanel (“Metric trees 101”), Count.co (“Intro to metric trees”), and Paul Levchuk’s “Metric Tree Trap” critique.

Definition: a metric tree is a hierarchical decomposition of a top-level (north-star) metric into the operational drivers that produce it. Edges represent causal or arithmetic influence (e.g., signups = traffic × conversion × acceptance). Structurally a DAG — a node can have multiple parents — but typically presented as a tree for readability.

Three layers in canonical form: a north-star output metric at the root, driver metrics in the middle, and controllable input metrics at the leaves.

Build process (Mixpanel/Count consensus):

  1. Define the north star
  2. Decompose into 2-3 components recursively until you hit something operationally controllable
  3. Assign owners per leaf
  4. Make it a living ritual — reviewed quarterly minimum, evolved when a leaf stops correlating with its parent

Critical tension surfaced by Levchuk: mathematical decomposition can obscure causal structure. A clean equation tree (revenue = price × volume) is not the same as a causal tree (what moves volume?). The Amazon WBR sidesteps this by demanding that the leaf metric be both controllable AND empirically correlated with the output — not just arithmetically related. This is exactly the founder’s instinct in saying the “targeting system” should be the trackable outcome and the DAG should show how tools and readings generate that outcome. He’s asking for a causal metric tree, not an arithmetic one.
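
The distinction is compact enough to state in code. A toy TypeScript sketch (the names are invented for illustration): an arithmetic edge is an identity that always balances, while a causal edge is a falsifiable claim that can be retired:

// Arithmetic decomposition: true by construction, so it can never tell
// you which lever to pull.
const revenue = (price: number, volume: number) => price * volume;

// Causal edge: a claim about the world. It names a controllable driver
// and can turn out to be wrong; that is when you retire it.
interface CausalEdge {
  driver: string;      // e.g. "outbound demos booked"
  outcome: string;     // e.g. "volume"
  hypothesis: string;  // the stated mechanism linking driver to outcome
  retiredOn?: string;  // set once the driver stops moving the outcome
}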

Adjacent operating-rhythm patterns

What to adopt for RDCO

RDCO is a 1-person portfolio of 3-5 small bets (Squarely, MAC, Sanity Check, Mother of All Cron candidate, etc.). Full Amazon-style WBR overhead is wrong-sized — you can’t run a 60-minute weekly meeting with 400 metrics across 5 bets when you’re solo. Three patterns to lift:

  1. Controllable-input-metric discipline at the per-bet level. Each bet’s targeting system is its output metric (P&L for Squarely; reliability + adoption for MAC; subscriber LTV for Sanity Check). The instrumentation layer is the controllable inputs we believe drive that output. The decision-trace table records when an input stopped correlating and we replaced it. This is the WBR pattern compressed to a single-bet, asynchronous-review form factor.

  2. Decision-trace table as the WBR substitute. Instead of a Wednesday meeting, a chronological log per bet recording: date, decision, which layer changed (instrumentation/tools/targeting), observed outcome shift, why we made the call. Founder reviews on his own cadence (weekly skim, quarterly deep). This preserves the audit-trail benefit of the WBR without the meeting overhead. (Example rows are sketched just after this list.)

  3. Causal DAG view (not arithmetic decomposition). The targeting-system drill-down should show the causal flow: which tools produce which intermediate readings, which readings feed the targeting metric. This is the Levchuk-correct version of a metric tree — not just revenue = price × volume arithmetic, but we run X tool → it generates Y reading → which we believe influences Z target. The DAG therefore has to stay editable as the causal model evolves (Amazon’s input-metric-retirement pattern).
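
To make pattern 2 concrete, two invented example rows in roughly the shape the table would take (the bets, dates, and decisions are placeholders, not real history):

const exampleTraces = [
  {
    betId: 'squarely',
    date: '2026-05-02',
    decision: 'Retired "site visits" as an input metric',
    layer: 'instrumentation',
    whatChanged: 'Replaced with "qualified demo requests"',
    observedOutcomeShift: 'Visits had stopped correlating with closed revenue',
  },
  {
    betId: 'squarely',
    date: '2026-05-09',
    decision: 'Pointed the outreach tool at a narrower ICP list',
    layer: 'tools',
    whatChanged: 'New list source; same send cadence',
    observedOutcomeShift: 'Too early to tell; review in two weeks',
  },
];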

Patterns explicitly not worth porting: the 6-12 graph (overkill for solo cadence), the six-pager memo (already covered by vault notes), the Bar Raiser hiring pattern (no hiring), the dot collector (no team).

Design spec for HQ “Business Stack” feature

Information architecture

Targeting-system drill-down view

Click on the targeting system block → opens hq.raydata.co/bets/<slug>/targeting:

Data model (proposed)

// Each bet is one record. Stored in a Notion DB ("RDCO Bets") for founder-edit ergonomics.
interface Bet {
  id: string;
  slug: string;          // squarely, mac, sanity-check, ...
  name: string;
  thesis: string;
  status: 'active' | 'paused' | 'sunset';
  isCriticalComponent: boolean;
  vaultProjectPath: string; // 01-projects/<name>/
}

// Three child collections per bet (Notion sub-DBs or relation fields):
interface Sensor {
  betId: string;
  name: string;
  source: string;
  currentValue?: string | number;
  lastUpdate?: string;
  isControllable: boolean;
  retiredOn?: string;      // when we retired it; unset while live
  retiredByDecisionId?: string;
}

interface Tool {
  betId: string;
  name: string;
  status: 'built' | 'partial' | 'gap';
  lastInvocation?: string;
  playbookPath?: string;
}

interface DecisionTrace {
  betId: string;
  date: string;
  decision: string;
  layer: 'instrumentation' | 'tools' | 'targeting';
  whatChanged: string;
  observedOutcomeShift?: string;
  vaultNotePath?: string;
}

// Targeting + DAG
interface Targeting {
  betId: string;
  outputMetric: string;
  currentValue?: string;
  target: string;
  cadence: 'weekly' | 'monthly' | 'quarterly';
}

interface DagEdge {
  betId: string;
  fromNode: string;        // sensor.name OR tool.name OR intermediate reading
  toNode: string;
  hypothesis: string;      // the causal claim
}
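
To preview the Mermaid decision below (Implementation considerations, item 3), a minimal sketch of turning a bet’s DagEdge rows into Mermaid flowchart source. The helper name and the sample edges are invented, not existing code:

// Render a bet's causal DAG as Mermaid flowchart source. The hypothesis
// becomes the edge label so the causal claim stays visible in the view.
function dagToMermaid(edges: DagEdge[]): string {
  const id = (name: string) => name.replace(/[^a-zA-Z0-9]/g, '_');
  const lines = edges.map(
    (e) =>
      `  ${id(e.fromNode)}["${e.fromNode}"] -->|"${e.hypothesis}"| ${id(e.toNode)}["${e.toNode}"]`
  );
  return ['flowchart LR', ...lines].join('\n');
}

// Invented sample for a Squarely-shaped bet:
const sample: DagEdge[] = [
  { betId: 'squarely', fromNode: 'outreach tool', toNode: 'demos booked',
    hypothesis: 'tighter ICP list yields more demos' },
  { betId: 'squarely', fromNode: 'demos booked', toNode: 'P&L',
    hypothesis: 'demos convert to paid engagements' },
];
// dagToMermaid(sample) returns flowchart source the Astro page can render directly.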

Implementation considerations

HQ stack today (per ~/.claude/state/hq-phase3.5-status.md):

Decisions to make for this feature:

  1. Storage location. Recommendation: Notion DB (“RDCO Bets” with linked sub-DBs for Sensors, Tools, DecisionTraces, Targeting, DagEdges). Founder edits in Notion — same pattern as the existing Notion task board — and the API reads via the Notion MCP equivalent. Rationale: founder needs to add/retire sensors and log decisions on his own time; he won’t open a CLI to do it. Notion is the surface he already uses. Local SQLite on Mac Mini is faster but creates an editing-friction tax that will kill the loop.

  2. Editor responsibility. Founder owns the targeting and bet roster (those are judgment calls). Ray auto-appends decision-trace rows whenever a skill ships a meaningful update to a bet (e.g., when /check-board closes a bet-tagged task, when a deploy lands, when an SC issue ships). Ray drafts; founder approves on review. Sensors and tools are jointly editable.

  3. DAG rendering. Phase B: Mermaid (declarative, no JS framework cost, renders fine in Astro via the existing prose styles). Phase C: revisit react-flow if the DAGs get complex enough that interactive editing matters. Mermaid covers the “show me the causal flow” use case at near-zero implementation cost.

  4. New API endpoints (Phase A; a minimal Hono sketch follows the routes list below):

    • GET /api/bets — list of bets for the navigator
    • GET /api/bets/<slug> — full bet record with sensors, tools, targeting, decision traces
    • GET /api/bets/<slug>/dag — DAG nodes + edges for the drill-down (Phase B)
  5. New Astro routes:

    • src/pages/bets/index.astro — bet navigator (also embedded as a pane on the HQ home)
    • src/pages/bets/[slug].astro — per-bet stack page
    • src/pages/bets/[slug]/targeting.astro — targeting drill-down + DAG + decision traces
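
A minimal sketch of the Phase A read surface in Hono, assuming loader helpers backed by the Notion DBs (listBets and loadBet are hypothetical; only the route shapes come from the list above):

import { Hono } from 'hono';

// Hypothetical Notion-backed loaders; shapes match the data model above.
declare function listBets(): Promise<Bet[]>;
declare function loadBet(slug: string): Promise<
  | (Bet & { sensors: Sensor[]; tools: Tool[]; targeting: Targeting; traces: DecisionTrace[] })
  | null
>;

const app = new Hono();

// GET /api/bets: bet roster for the navigator.
app.get('/api/bets', async (c) => c.json(await listBets()));

// GET /api/bets/:slug: full record with sensors, tools, targeting, decision traces.
app.get('/api/bets/:slug', async (c) => {
  const bet = await loadBet(c.req.param('slug'));
  if (!bet) return c.json({ error: 'unknown bet' }, 404);
  return c.json(bet);
});

export default app; // Bun serves this default export directly.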

Phased build

Phase A (MVP, ~2-3 days):

Phase B (~1-2 days after A is in use):

Phase C (~as instrumentation matures):

Open questions for founder

  1. Decision-trace ownership: Ray-auto-logged or founder-manual? Recommendation is hybrid (Ray drafts, founder approves), but if you want the trace table to be only founder-curated (highest signal, lower volume), say so before Phase A — it changes the API write surface.

  2. Phase A scope: all 3-5 bets at once, or pilot with one (Squarely)? Pilot is faster to ship and iterates the schema before we lock it in across the portfolio. Big-bang is more impressive but bakes in schema mistakes. Default recommendation: pilot Squarely first, then port MAC + SC once the shape feels right.

  3. DAG editor: Phase B or never? If you only need to view the DAG, Mermaid is enough. If you want to drag-edit nodes (add a new sensor visually), we need react-flow and Phase B grows. The initial spec assumes view-only; founder may want editing.

  4. Cadence enforcement: passive or active? Should HQ surface “your Squarely WBR-equivalent is 9 days overdue” as a Decision-Needed item, or stay passive and trust founder to open the page when he wants? The Amazon discipline says active. Founder discipline preference may differ.

Changelog