Day 1 — Your tests are in one cell

Subject line options:

Your dbt tests are all in one cell
The 3am question nobody can answer
Count your not_nulls

{{REVIEW: three subject line options. Pick one or tell me which direction you want — the third is the most "sanity check" voice, the first is the most literal.}}

Last month I pulled up a dbt project and ran grep -c 'not_null\|unique\|accepted_values' schema.yml in every model directory.

Those three accounted for 87% of the tests in the repo.

That's not 87% of the models carrying those tests; it's 87% of the total test count coming from those three generic tests. Add a few relationships tests and a handful of custom SQL tests, and that's the whole testing surface for a business doing eight figures in ARR.

This isn't a bad team. I've seen the same shape at most of the clients I work with. The senior analytics engineer is smart, dbt-fluent, and genuinely cares. The test coverage still looks like this because the easiest thing to test is a single column, in isolation, against a fixed rule. That's what schema.yml was designed to express, and the path of least resistance is a powerful force.

Here's the problem.

A dbt model has three places a bug can hide:

A column — one field is wrong.
A row — the fields are individually fine but they don't cohere. Closed-won rows with a future close date. New-business opportunities with zero ACV. A Salesforce row with no sf_id.
An aggregate — every row passes every check and the total is wrong. A join fanned out. A filter silently dropped records. A window function double-counted.
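
To make the row scope concrete, here is a minimal sketch. It uses Python's built-in sqlite3 only so it runs anywhere without a warehouse. The opportunities table and its columns (stage, close_date, acv) are hypothetical names I made up for illustration. In a dbt project, a query like this could live in a singular test, which fails when the query returns any rows.

```python
import sqlite3

# Hypothetical table and values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE opportunities (id INTEGER, stage TEXT, close_date TEXT, acv REAL)"
)
conn.executemany(
    "INSERT INTO opportunities VALUES (?, ?, ?, ?)",
    [
        (1, "closed_won", "2024-01-15", 12000.0),  # coherent row
        (2, "closed_won", "2999-01-01", 8000.0),   # closed-won with a future close date
        (3, "open", "2999-06-30", 5000.0),         # open deals may close in the future
    ],
)

# Row-scope check: each column is individually valid (non-null, known stage),
# but the combination of stage and close_date does not cohere.
ROW_CHECK = """
    SELECT id
    FROM opportunities
    WHERE stage = 'closed_won'
      AND close_date > date('now')
    ORDER BY id
"""

failing_ids = [row[0] for row in conn.execute(ROW_CHECK)]
print(failing_ids)
```

Row 2 would pass not_null and accepted_values on every column. Only a check that reads two columns together flags it.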

And five things the bug could be relative to:

An absolute rule you wrote down.
The upstream source.
The production dashboard or finance ledger.
An external reconciliation — Stripe, the bank, a vendor export.
History.

not_null, unique, and accepted_values all sit in one cell of that space. Column scope, Absolute basis. That cell is important. It is not the whole building.

The 3am question I can't answer with that cell: did the total number we served last night match the total number we served last Tuesday, and if not, why? That's aggregate scope, temporal basis. Nothing in a standard schema.yml will tell you.
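
A rough sketch of what an aggregate, temporal check could look like, again using sqlite3 so it is runnable as-is. The daily_totals table, the sample numbers, and the 10% tolerance are all assumptions for illustration, not a recommendation. Note that it only answers the "did it match" half of the question. The "why" still needs a person.

```python
import sqlite3

# Hypothetical daily rollup; names and numbers are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_totals (day TEXT PRIMARY KEY, total REAL)")
conn.executemany(
    "INSERT INTO daily_totals VALUES (?, ?)",
    [("2024-03-05", 1000.0), ("2024-03-12", 640.0)],
)


def week_over_week_drift(conn, day):
    """Relative change of `day`'s total versus the same weekday one week earlier."""
    row = conn.execute(
        """
        SELECT cur.total, prev.total
        FROM daily_totals AS cur
        JOIN daily_totals AS prev
          ON prev.day = date(cur.day, '-7 days')
        WHERE cur.day = ?
        """,
        (day,),
    ).fetchone()
    if row is None or row[1] == 0:
        return None  # no usable baseline to compare against
    current, previous = row
    return (current - previous) / previous


TOLERANCE = 0.10  # assumed threshold; tune to the metric's normal variance
drift = week_over_week_drift(conn, "2024-03-12")
breached = drift is not None and abs(drift) > TOLERANCE
print(drift, breached)
```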

The harder question: does the closed-won revenue in our pipeline model reconcile to the transaction model row-for-row? Aggregate, Relative:Production. Same problem.
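
One way to sketch that reconciliation: compare the totals first, then drill down to the keys that disagree. The table names (pipeline_closed_won, transactions) and columns are hypothetical, and the duplicated row stands in for a join that fanned out.

```python
import sqlite3

# Hypothetical models; the repeated (2, 300.0) row simulates a fan-out.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pipeline_closed_won (opportunity_id INTEGER, revenue REAL);
    CREATE TABLE transactions (opportunity_id INTEGER, amount REAL);
    INSERT INTO pipeline_closed_won VALUES (1, 500.0), (2, 300.0), (2, 300.0);
    INSERT INTO transactions VALUES (1, 500.0), (2, 300.0);
    """
)

# Aggregate check: do the two models agree on the total?
pipeline_total = conn.execute(
    "SELECT SUM(revenue) FROM pipeline_closed_won"
).fetchone()[0]
ledger_total = conn.execute("SELECT SUM(amount) FROM transactions").fetchone()[0]
gap = pipeline_total - ledger_total

# Drill-down: which opportunities disagree? This only covers ids present in
# the pipeline model; ids that exist only in transactions need the reverse join.
mismatches = conn.execute(
    """
    SELECT p.opportunity_id, p.revenue, t.amount
    FROM (SELECT opportunity_id, SUM(revenue) AS revenue
          FROM pipeline_closed_won GROUP BY opportunity_id) AS p
    LEFT JOIN (SELECT opportunity_id, SUM(amount) AS amount
               FROM transactions GROUP BY opportunity_id) AS t
      ON t.opportunity_id = p.opportunity_id
    WHERE t.amount IS NULL OR p.revenue <> t.amount
    ORDER BY p.opportunity_id
    """
).fetchall()
print(gap, mismatches)
```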

The hardest one: for this column that gets conditionally overridden — do we test against the right source in each case? Column, Relative:Source, conditional. This is the one that ate three weeks of my life last month.

If your testing lives entirely in the column/absolute corner, the rest of the space is uninspected. Every bug that hides there will reach production and sit there until a human happens to notice. And if your consumers are increasingly agents rather than humans, no one will.

Tomorrow I'll walk the first axis — Scope. Column vs. Row vs. Aggregate, what each one catches, and the quick diagnostic to figure out which scope your current tests are actually covering.

Before then, one question worth sitting with:

Of the last three data bugs that reached your stakeholders — what scope were they? And what basis would have caught them?

Hit reply and tell me. I read every one.

— Ben

{{REVIEW: sign-off. You used "— Ben" casually in the relaunch essay; using it here. Change if you want Mr. Ben / no sign-off / full name.}}

Day 1 of 7. Tomorrow: The Scope axis — Column vs Row vs Aggregate.