06-reference

commoncog whats operational definition

Tue Apr 14 2026 20:00:00 GMT-0400 (Eastern Daylight Time) · reference · source: Commoncog · by Cedric Chin

“What’s an Operational Definition Anyway?” — @CedricChin

Why this is in the vault

This is the foundational piece on how metrics should be defined before they are ever measured — and it maps almost one-for-one onto the row-label layer of Ray Data Co’s MAC (Model Acceptance Criteria) matrix. If MAC is the “what to check” framework, operational definitions are the discipline that makes each check reproducible, comparable across teams, and resistant to silent redefinition. Strong mapping.

The core argument (paraphrased)

Chin opens with a friend’s story: a product team quietly changed “active user” from >$100 spend to >$10 spend — active user counts jumped, and “it was actually a pretty smart move” for hitting quarterly targets. Chin’s horrified response anchors the essay: without recorded metric definitions, every number is rewritable.

Drawing on Donald Wheeler’s Making Sense of Data, he compresses SPC (statistical process control) data-collection practice into two rules:

  1. “Unless all the values in a set are collected in a consistent manner, the values will not be comparable.”
  2. “It is rarely satisfactory to use data collected for one purpose for a different purpose.”

Chin adds a third rule of his own: every metric must come with an operational definition (OD). An OD has three parts: a criterion (what property is being judged), a test procedure (how, where, and how often it is measured), and a decision rule (how a measurement maps to a yes/no call).
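
A minimal sketch of an OD as a record, with field names of my own (Chin specifies the three parts, not a schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OperationalDefinition:
    """One metric pinned down: criterion, test procedure, decision rule."""
    metric: str          # the name people argue about, e.g. "active_user"
    criterion: str       # what property is being judged
    test_procedure: str  # how, where, and how often it is measured
    decision_rule: str   # how a measurement maps to a yes/no call

# The essay's contested metric, written down so it cannot be silently rewritten:
ACTIVE_USER = OperationalDefinition(
    metric="active_user",
    criterion="customer spend in the trailing period",
    test_procedure="sum settled transactions per customer from the billing database",
    decision_rule="spend > $100 counts as active; anything else is inactive",
)
```

The frozen flag is deliberate: changing the threshold should mean minting a new definition, not editing this one (see point 3 under the mapping below).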

Key moves in the essay: the silent-redefinition story as the hook, Wheeler’s two comparability rules as the foundation, and the three-part OD as the fix.

Mapping against Ray Data Co

Mapping strength: strong. This article is functionally the documentation layer that makes the MAC framework defensible.

1. ODs ARE the row-label discipline of the MAC 3×6 matrix. MAC’s 18 cells (scope × basis) describe which category of check applies to a metric. But each cell only becomes executable when the underlying metric has an operational definition — criterion, test procedure, decision rule. Without ODs, a Stop/Pause/Go severity call is undefended: if “active user” can be silently redefined, every threshold is meaningless. MAC without OD discipline is a framework without a foundation. See ../01-projects/data-quality-framework/testing-matrix-template.
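
A sketch of that dependency, with the cell fields and registry invented for illustration (the real scope × basis labels live in the testing-matrix-template):

```python
from dataclasses import dataclass

@dataclass
class MacCell:
    scope: str  # one of the 3 scope labels in the 3×6 matrix
    basis: str  # one of the 6 basis labels

def run_check(cell: MacCell, metric: str, od_registry: dict) -> str:
    """Return a severity call ("go" / "pause" / "stop") for one cell."""
    od = od_registry.get(metric)
    if od is None:
        # No written definition means any threshold is undefended;
        # refuse to grade rather than grade a rewritable metric.
        raise ValueError(f"no operational definition on record for {metric!r}")
    # ...evaluate the cell's check against od.decision_rule here...
    return "go"  # placeholder for the real evaluation
```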

2. The agent-deployer’s first job is writing ODs. Per 2026-04-14-levie-agent-deployer-role-jd, the agent-deployer instruments AI workflows and manages evals. Evals are operational definitions for AI output quality — criterion (“factual accuracy on customer-support responses”), test procedure (“LLM-as-judge against gold set X sampled monthly”), decision rule (“response counts as accurate if it matches gold on intent and cites a real policy”). Chin’s framework gives us the template. The agent-deployer skill should probably include an OD authoring step before any dashboard is built.
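
The same three-part template instantiated for that eval, as a sketch; it assumes nothing about how gold set X is stored or sampled:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EvalOD:
    criterion: str
    test_procedure: str
    decision_rule: str

SUPPORT_ACCURACY = EvalOD(
    criterion="factual accuracy of customer-support responses",
    test_procedure="LLM-as-judge against gold set X, sampled monthly",
    decision_rule="accurate iff intent matches gold AND the cited policy exists",
)

def accurate(intent_matches_gold: bool, cited_policy_exists: bool) -> bool:
    # The decision rule made executable: both conditions must hold.
    return intent_matches_gold and cited_policy_exists
```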

3. State-ownership requires versioned ODs. The ../04-tooling/rdco-state-ownership-architecture thesis says the client owns the vault + skills + data. Chin’s friend’s story is the dark-mirror version: when ODs aren’t written down, the team can silently rewrite history. For RDCO, the vault must store each client’s ODs with git-style versioning — so when “active user” changes from $100 to $10, the old metric is deprecated, the new one is a separate metric, and dashboard comparisons are broken by design. This is a concrete vault-schema recommendation.
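
A sketch of that vault-schema recommendation, with invented names; the registry is append-only, and cross-version comparison fails loudly:

```python
class ODRegistry:
    """Append-only store: a definition change mints a new metric id;
    it never rewrites an old one."""

    def __init__(self):
        self._ods: dict[str, dict] = {}

    def register(self, metric_id: str, definition: str, supersedes: str | None = None):
        if metric_id in self._ods:
            raise ValueError(f"{metric_id} already registered; ODs are immutable")
        self._ods[metric_id] = {"definition": definition, "supersedes": supersedes}

    def comparable(self, a: str, b: str) -> bool:
        # Wheeler's Rule 1: values under different definitions are not comparable.
        return a == b

registry = ODRegistry()
registry.register("active_user_v1", "trailing spend > $100")
registry.register("active_user_v2", "trailing spend > $10", supersedes="active_user_v1")
assert not registry.comparable("active_user_v1", "active_user_v2")  # broken by design
```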

4. phData/MG angle: ODs are the moat against “data-driven theatre.” In consulting engagements (phData-style delivery), the first observable win with a new client is often cleaning up metric definitions — usually 30-60% of dashboards compare values collected inconsistently. Chin’s Rules 1 and 2 give a reusable audit checklist. The MAC engagement opens with an OD audit; the OD audit typically finds more surface-level problems than MAC itself.
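
One way Rules 1 and 2 might compile into an automated audit pass (the sample fields are invented):

```python
def audit_consistency(series: list[dict]) -> list[str]:
    """Rule 1: values in a set are comparable only if collected consistently.
    Each sample: {"period": str, "value": float, "collection_method": str}."""
    methods = {s["collection_method"] for s in series}
    if len(methods) > 1:
        return [f"one series, {len(methods)} collection methods: {sorted(methods)}"]
    return []

def audit_repurposing(collected_for: str, used_for: str) -> list[str]:
    """Rule 2: data collected for one purpose rarely serves another."""
    if collected_for != used_for:
        return [f"repurposed: collected for {collected_for!r}, used for {used_for!r}"]
    return []
```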

5. Harness implication: ODs belong in skills, not prompts. Per the Tan/Levie fat-skills thesis, reusable discipline should live in versioned skill files. An author-operational-definition skill — taking a metric name and walking through criterion/procedure/rule — would be a natural addition to the agent-deployer toolkit. Low effort, high leverage.
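
A sketch of that skill's core step; the input() walkthrough is a stand-in for whatever interface the skill file actually drives:

```python
def author_od(metric: str) -> dict:
    """Walk the author through all three parts; refuse to finish with any blank."""
    od = {
        "metric": metric,
        "criterion": input(f"Criterion for {metric!r} (what is judged): "),
        "test_procedure": input("Test procedure (how/where/how often measured): "),
        "decision_rule": input("Decision rule (measurement -> yes/no): "),
    }
    missing = [k for k, v in od.items() if not v.strip()]
    if missing:
        raise ValueError(f"incomplete OD, missing: {missing}")
    return od
```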