06-reference

commoncog becoming data driven first principles

Tue Apr 14 2026 20:00:00 GMT-0400 (Eastern Daylight Time) · reference · source: Commoncog · by Cedric Chin

“Becoming Data Driven, From First Principles” — @CedricChin

Why this is in the vault

This is the cornerstone piece of Cedric Chin’s Becoming Data Driven series and the most direct intellectual parent the vault has for ../04-tooling/rdco-state-ownership-architecture and the MAC (Model Acceptance Criteria) framework. Chin walks from Deming/SPC first principles to the Amazon Weekly Business Review, and along the way articulates why being data driven is a discipline to be learned, not a stance to be declared. Every argument in this piece maps onto what RDCO is trying to build as a consulting posture. Filed with high priority: expect this note to be referenced in future vault work.

The core argument (paraphrased)

The purpose of data is knowledge — specifically, theories or models that let you predict the outcomes of your business actions.

Chin’s framing: most “data-driven” businesses collect numbers, set KPIs, and run the company as a black box (“spending and process and labour combine to serve customers to spit out profit the other end”) without any causal model of how the parts connect. When targets are missed, they doctor the numbers or rationalize (“100% of OKRs means you weren’t ambitious enough”). In Deming’s phrase, they are “running their businesses on superstition.”

The alternative is Deming’s Statistical Process Control (SPC), also known as continuous improvement or quality engineering. Chin summarizes it as “a style of thinking with a few tools attached.” The tools (process behaviour charts, XmR charts, operational definitions) are secondary to the thinking style.

The thinking style, compressed:

  1. Epistemology over truth. Deming argued there is no truth in business, only knowledge. Knowledge is evaluated on predictive validity, is conservative (updates when reality changes), and is therefore safer than truth. Businesses die because leaders mistake contingent knowledge for permanent truth.
  2. Variation is routine until proven exceptional. Most business metrics show random variation around a process mean. The XmR chart (process behaviour chart) gives you statistically principled limits to distinguish routine variation (don’t react) from exceptional variation (investigate, because the process has changed).
  3. You must earn the right to criticize data-driven-ness. The people who dismiss “being data driven” by arguing for “data informed” or “running on vibes” usually have nothing viable to offer. The critique is only credible from operators who already know how to use data properly.
  4. The payoff is a causal model of your business in your head. When you launch a marketing campaign, hire an engineer, or change incentives, you are implicitly predicting outcomes. The XmR chart is how you check whether your prediction was correct, and therefore how you learn.
  5. From SPC to Amazon’s WBR. The Amazon Weekly Business Review is an applied instantiation of these ideas — a pre-packaged metrics practice refined through years of trial. You can run it as a recipe, but the real value is the underlying principles, because those let you come up with equivalently powerful mechanisms for your own context.
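
A minimal sketch of the XmR check from point 2, in Python. It assumes the standard individuals-chart construction, where the natural process limits sit at the mean plus or minus 2.66 times the average moving range. The function names and the weekly sign-up numbers are hypothetical and do not come from Chin's essay. Only the simplest detection rule (a point outside the limits) is shown; run-based rules are left out.

```python
from statistics import mean


def xmr_limits(values):
    """Return (centre, lower, upper) natural process limits for an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two points to compute a moving range")
    centre = mean(values)
    # Moving range: absolute difference between each pair of consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    avg_mr = mean(moving_ranges)
    # 2.66 is the usual XmR scaling constant (3 / 1.128 for ranges of size 2).
    return centre, centre - 2.66 * avg_mr, centre + 2.66 * avg_mr


def exceptional_points(new_values, baseline):
    """Return (index, value) pairs in new_values that fall outside the baseline limits."""
    _, lower, upper = xmr_limits(baseline)
    return [(i, v) for i, v in enumerate(new_values) if v < lower or v > upper]


# Hypothetical weekly sign-up counts, used only for illustration.
baseline = [102, 98, 105, 99, 101, 97, 103, 100]
print(xmr_limits(baseline))
print(exceptional_points([104, 96, 131], baseline))  # -> [(2, 131)]
```

With this baseline the limits come out at roughly 88.5 and 112.8, so 104 and 96 count as routine variation and only 131 is flagged for investigation. In practice the baseline would be a stable stretch of the metric's own history.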

Chin is explicit that this essay sets up his two-part WBR deep-dive. It is still the more important piece, because the WBR is only one implementation of the principles, and the principles can generate others.

Mapping against Ray Data Co

This article is the direct intellectual foundation for what RDCO is building. Five mappings:

1. MAC (Model Acceptance Criteria) is SPC for AI-era data models. The MAC framework is a 3×6 matrix of scope (column/row/aggregate) × basis (absolute/rel-source/rel-production/rel-recon/temporal/human) with Stop/Pause/Go severity tiers. It is structurally identical to what Chin describes: routine vs exceptional variation, with response protocols tied to severity. The differences are (a) MAC extends SPC beyond manufacturing into data models, and (b) the “exceptional variation” response in MAC includes LLM-augmented investigation rather than just human inspection. See ../01-projects/data-quality-framework/testing-matrix-template.

2. “The purpose of data is knowledge” is the state-ownership thesis in one sentence. RDCO’s ../04-tooling/rdco-state-ownership-architecture argues that the client owns the vault + skills + data, and the AI model is a commodity. Chin’s framing is adjacent: what the operator owns is a causal model of their business — the vault is how that model persists across sessions. The SPC/WBR discipline is how the model gets updated when reality shifts.

3. The agent-deployer role is the modern SPC operator. Per 2026-04-14-levie-agent-deployer-role-jd, enterprises need someone who understands data flows, instruments AI workflows, and manages evals. That is functionally the same role Chin describes, an operator running Weekly Business Reviews, but applied to AI agent outputs instead of manufacturing lines. MAC is the WBR for agents.

4. “You must earn the right to criticize data-driven-ness” is the defensive frame against harness-thesis dissent. When critics (e.g. Moura’s entangled-software thesis, 2026-04-13-moura-entangled-software-agent-harnesses-dead) argue harnesses are unnecessary because models will just do it all, Chin’s reply applies: the critique is only credible from people who have run the discipline and know where it breaks down. The harness-thesis is the data-literacy equivalent — you don’t get to argue against it until you’ve tried operating without it.

5. The cross-client consulting upside is Amazon’s WBR pattern. Chin notes that the WBR is the pre-packaged implementation you can apply immediately if you are willing to put in the work. RDCO’s consulting posture (per 2026-04-14-levie-agent-deployer-role-jd Posture 2: Playbook + Coaching) is exactly this: hand the client a pre-packaged MAC practice + skills, coach them for 3-6 months, leave them with a discipline they own. The MAC drip course is the WBR-in-a-box equivalent.
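
The MAC matrix from mapping 1 can be written down as plain data. The sketch below is hypothetical. The scope and basis labels and the Stop/Pause/Go tiers come from this note, but the class names, the severity ordering (Go < Pause < Stop), and the rule that the worst finding wins are assumptions made for illustration, not the framework's actual implementation.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum
from itertools import product


class Scope(Enum):
    COLUMN = "column"
    ROW = "row"
    AGGREGATE = "aggregate"


class Basis(Enum):
    ABSOLUTE = "absolute"
    REL_SOURCE = "rel-source"
    REL_PRODUCTION = "rel-production"
    REL_RECON = "rel-recon"
    TEMPORAL = "temporal"
    HUMAN = "human"


class Severity(IntEnum):
    # Assumed ordering: a higher value means a stronger response is required.
    GO = 0
    PAUSE = 1
    STOP = 2


@dataclass(frozen=True)
class CheckResult:
    scope: Scope
    basis: Basis
    severity: Severity  # GO means the check found nothing to act on


# All 3 x 6 = 18 cells of the matrix, one (scope, basis) pair per cell.
MATRIX_CELLS = list(product(Scope, Basis))


def overall_verdict(results):
    """Assumed roll-up rule: the most severe finding decides the verdict."""
    return max((r.severity for r in results), default=Severity.GO)


results = [
    CheckResult(Scope.COLUMN, Basis.ABSOLUTE, Severity.GO),
    CheckResult(Scope.AGGREGATE, Basis.TEMPORAL, Severity.PAUSE),
]
print(len(MATRIX_CELLS))              # -> 18
print(overall_verdict(results).name)  # -> PAUSE
```

Keeping the cells as enumerated data makes it easy to see which (scope, basis) combinations a given client's test suite leaves uncovered.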

One thing this article implicitly challenges us on: Chin’s core warning is that most businesses run on superstition because they’ve never been taught basic data literacy. The MAC framework only works if the operator has enough statistical-thinking literacy to not treat every spike as a crisis. Training that literacy is part of the consulting engagement — it’s not a “here’s a skill, you’re done” handoff. Worth baking into the coaching curriculum.