06-reference / research

mac vs published data quality frameworks

Sat Apr 18 2026 20:00:00 GMT-0400 (Eastern Daylight Time) · research-brief · source: deep-research

Where MAC Fits in the Data-Quality Acceptance-Framework Landscape

The question

What published data-quality acceptance frameworks exist beyond Great Expectations and dbt-expectations (Soda, Monte Carlo, Datafold, Anomalo, Bigeye), and how do their “acceptance criteria” definitions compare structurally to MAC’s 3×6 Scope × Basis matrix?

We care because the founder’s Friday note flagged that MAC is not yet proven and needs multiple implementations. Credibility for the Client Reporting + MAC content series depends on showing MAC is a coherent member of the category — not a one-off — while still saying something the category does not.

What we already know (from the vault)

What the web says

Soda (SodaCL). YAML-based domain-specific language. ~25-50+ built-in metrics (freshness, volume, schema, distribution, missing values, RAD record-level anomaly detection). Lives in Git next to transformations. Soda Cloud layers data contracts on top — codified producer/consumer agreements. Organization: flat library of declarative checks per dataset. No published 2D taxonomy.
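To make the "flat library of declarative checks per dataset" concrete, here is a minimal SodaCL-style sketch; the dataset and column names are invented, and the exact metric set available depends on the Soda version:

```yaml
# Hypothetical SodaCL checks for an "orders" dataset — a sketch of the
# flat, per-dataset check style; dataset/column names are invented.
checks for orders:
  - row_count > 0                     # volume
  - missing_count(customer_id) = 0    # missing values
  - freshness(updated_at) < 1d        # freshness
  - schema:
      fail:
        when required column missing: [order_id, amount]
```

Note the organization: one flat list per dataset, with no axis telling you which kinds of checks you have not written.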

Monte Carlo. Sells “ML-powered anomaly detection” across five pillars: freshness, volume, schema, distribution, lineage. Recently extended to “Data + AI Observability” (model inputs, agent behavior, output drift). Auto-configured monitors; no user-authored rules required for the baseline. Organization: flat — five-pillar marketing taxonomy, ML under the hood.

Datafold. Differs from the others — primary primitive is data-diff (compare table A vs table B across environments / before-and-after a PR). Plus column-level profiling and anomaly detection. Strongest in CI/CD workflows (catch regressions before merge). No published structural taxonomy of check types.

Anomalo. Unsupervised ML across full datasets — explicitly markets “no rules to define.” ML learns historical structure and trends, surfaces anomalies. Distribution shifts, pattern changes, full table scans. Organization: anti-taxonomy by design — the pitch is that you don’t author or organize tests, the system does.

Bigeye. Auto-monitor creation per table. 70+ pre-built data quality metrics. ML-suggested anomaly thresholds. Five-pillar coverage (freshness/volume/schema/distribution/lineage). In their 2022 blog they propose four flat categories: Data Observability, Data Transformation, Data Testing, Data Lineage — a flat market-segmentation list, not a 2D test-design matrix.

Great Expectations (baseline). Library of “Expectations” — each is a named, parameterized assertion (expect_column_values_to_not_be_null, expect_column_pair_values_A_to_be_greater_than_B, etc.). ~300+ expectations organized informally by what they assert on (column, multi-column, table). Naming convention implies a scope dimension but no published basis dimension. Organization: flat library with naming-convention scope grouping.
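The "naming convention implies a scope dimension" point can be shown mechanically. The sketch below infers a scope from expectation-name prefixes; the prefix rules are an assumption drawn from the naming convention, not a Great Expectations API:

```python
# Sketch: recover the implicit "scope" grouping from Great Expectations
# expectation names. The prefix rules are an assumption based on the
# naming convention, not a GE API.
def infer_scope(expectation_name: str) -> str:
    if expectation_name.startswith(("expect_column_pair_", "expect_multicolumn_")):
        return "multi-column"
    if expectation_name.startswith("expect_column_"):
        return "column"
    if expectation_name.startswith("expect_table_"):
        return "table"
    return "unknown"

names = [
    "expect_column_values_to_not_be_null",
    "expect_column_pair_values_A_to_be_greater_than_B",
    "expect_table_row_count_to_be_between",
]
print([infer_scope(n) for n in names])  # ['column', 'multi-column', 'table']
```

The scope axis is recoverable from the names; there is no comparable convention encoding a basis axis, which is exactly the gap MAC's second dimension fills.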

dbt tests / dbt-expectations (baseline). Native dbt: four generic tests (unique, not_null, accepted_values, relationships). dbt-expectations ports the GE library to dbt. Organization: flat library, augmented by severity: warn|error config — the closest dbt-native analog to MAC’s Stop/Pause/Go tiering.
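For reference, dbt's native severity knob looks like this in a schema file; the model and column names are invented, and the mapping of warn/error onto MAC's tiers is an illustrative assumption:

```yaml
# schema.yml sketch — model/column names are invented. severity: warn|error
# is dbt's native knob, the closest analog to a Stop/Pause/Go tier.
models:
  - name: fct_orders
    columns:
      - name: order_id
        tests:
          - unique                     # default severity: error ("Stop")
          - not_null:
              config:
                severity: warn         # alert but don't block ("Pause")
```

Two tiers instead of three, and no notion of remapping by layer — which is where MAC's layer-aware severity goes further.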

Convergences and contradictions

Where the vendor landscape converges:

Where it diverges:

Where MAC’s 18 cells map to vendor coverage:

| Cell | Vendor coverage |
| --- | --- |
| Column × Absolute | Universal (commodity) |
| Column × Rel:Source | Datafold (column diffs); GE (cross-table expectations) |
| Column × Rel:Production | Sparse — manual SQL |
| Column × Rel:Recon | Essentially none |
| Column × Temporal | Strong — every vendor’s distribution / null-rate monitor |
| Column × Human | None |
| Row × Absolute | GE / Soda / dbt-expectations multi-column rules |
| Row × Rel:Source | Datafold row-diff |
| Row × Rel:Production | Sparse |
| Row × Rel:Recon | Manual |
| Row × Temporal | Sparse — vendors trend aggregates, not row-level changes |
| Row × Human | None |
| Aggregate × Absolute | Some (Soda metric checks); often hand-coded |
| Aggregate × Rel:Source | Datafold; manual elsewhere |
| Aggregate × Rel:Production | Manual SQL — no vendor primitive |
| Aggregate × Rel:Recon | Manual SQL — no vendor primitive |
| Aggregate × Temporal | Universal (the “five pillars” core) |
| Aggregate × Human | None |

Roughly: vendors are dense in the Temporal column (especially Aggregate-Temporal) and the Column-Absolute cell. They are sparse-to-absent across the six reconciliation cells (Rel:Production and Rel:Recon at every scope), the Row-Temporal cell, and the entire Human column.
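The coverage table above can be condensed into a toy structure to show what the 3×6 cross-product buys you: a mechanical gap report. The dense/sparse/none labels are my compression of the table's cell notes, not a published rating:

```python
# Toy sketch of MAC's 3x6 Scope x Basis matrix, with vendor coverage
# condensed from the table above into dense / sparse / none (my labels).
SCOPES = ("column", "row", "aggregate")
BASES = ("absolute", "rel_source", "rel_production", "rel_recon", "temporal", "human")

coverage = {
    ("column", "absolute"): "dense",    ("column", "rel_source"): "dense",
    ("column", "rel_production"): "sparse", ("column", "rel_recon"): "none",
    ("column", "temporal"): "dense",    ("column", "human"): "none",
    ("row", "absolute"): "dense",       ("row", "rel_source"): "dense",
    ("row", "rel_production"): "sparse",("row", "rel_recon"): "sparse",
    ("row", "temporal"): "sparse",      ("row", "human"): "none",
    ("aggregate", "absolute"): "sparse",("aggregate", "rel_source"): "sparse",
    ("aggregate", "rel_production"): "sparse", ("aggregate", "rel_recon"): "sparse",
    ("aggregate", "temporal"): "dense", ("aggregate", "human"): "none",
}

# The gap report is the point of the matrix: which cells is no tool covering?
gaps = [cell for cell, level in coverage.items() if level != "dense"]
assert len(coverage) == len(SCOPES) * len(BASES) == 18
print(f"{len(gaps)} of 18 cells are sparse or uncovered")
```

A flat check library can never produce this report, because it has no notion of a cell that should exist but doesn't.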

Synthesis for RDCO

MAC is a coherent member of the category, with one genuinely novel structural move. The category exists — Soda, Monte Carlo, Datafold, Anomalo, Bigeye, Great Expectations, dbt-expectations all sell some version of “acceptance criteria for data” — and MAC speaks the same vocabulary (column / row / aggregate, completeness / precision / fidelity, severity tiers). It is not a fringe framework that requires re-educating buyers from zero. The Rogojan / Reis / Bigeye conversations all assume a reader who already accepts that data needs tests. MAC slots into that reader’s brain.

The novel move is the explicit 2D taxonomy with Basis as a first-class axis. Every vendor and every published framework I found organizes tests as a flat library — sometimes with informal scope grouping (GE), sometimes with a marketing taxonomy (“five pillars”, “four categories”), sometimes ML-discovered (Anomalo). None of them surface a cross-product of where you’re checking × what you’re checking against. Because of that, vendor users tend to over-index on the cells the vendor makes easy (Column-Absolute, Aggregate-Temporal) and silently skip the cells the vendor doesn’t surface (Row-Temporal, anything Recon, Human). MAC’s 3×6 grid forces test design through 18 categories rather than through whatever the tool’s check-library default happens to be. This is the same move Kimball’s bus matrix made for dimensional modeling: it doesn’t add new primitives, it makes the coverage gaps visible.

MAC’s three real differentiators against the vendor landscape:

  1. Reconciliation cells are first-class. Rel:Production and Rel:Recon are the cells where consulting work actually lives — tying the warehouse to the finance ledger, to Stripe, to the bank. The vendor landscape treats these as custom integrations. MAC treats them as required cells in the matrix. That is the explicit “Definition of Done” wedge for the Client Reporting series.
  2. Human Sanity Check is named. The vendor landscape pretends quality is fully automatable. MAC names the unautomatable cell so teams don’t drop it once they “have tests now.” This is consistent with the Justin Nixon “data quality as a product with SLAs” comment from the SDG note — SLAs imply a human stakeholder.
  3. Layer-aware severity. MAC’s bronze/silver/gold severity remapping has no direct analog in any vendor product. Soda has tags, dbt has warn|error, Monte Carlo has alert routing — but none of them prescribes that the same check should carry a different severity at different medallion layers. This is an opinion the vendors don’t have, and opinions sell consulting.
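Layer-aware severity can be sketched as a remapping table keyed on check kind and layer. The specific mappings below are an invented illustration, not MAC's published policy:

```python
# Illustrative sketch of layer-aware severity: the same check is remapped
# to a different severity tier per medallion layer. The specific mapping
# here is invented for illustration, not MAC's published policy.
SEVERITY_BY_LAYER = {
    "not_null":     {"bronze": "go",    "silver": "pause", "gold": "stop"},
    "recon_totals": {"bronze": "go",    "silver": "pause", "gold": "stop"},
    "schema_drift": {"bronze": "pause", "silver": "stop",  "gold": "stop"},
}

def severity(check: str, layer: str) -> str:
    """Look up the Stop/Pause/Go tier for a check at a given layer."""
    return SEVERITY_BY_LAYER[check][layer]

# Same check, different blast radius depending on where it fires:
assert severity("not_null", "bronze") == "go"    # land raw data anyway
assert severity("not_null", "gold") == "stop"    # block the reporting layer
```

Contrast with dbt, where `severity: warn|error` is a property of the test itself, not of the layer it runs in.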

Repositioning recommendation. Lead the MAC content series with “the matrix the vendors don’t ship.” Frame Soda / Monte Carlo / Anomalo as partial coverage of MAC (each covers some cells well; none covers all cells) rather than as competitors. Position MAC as the build-time coverage discipline that determines what to ask any of those tools to monitor — not as a replacement for them. This matches the ../2026-04-14-joe-reis-state-of-data-modeling-april-2026 read: MAC is upstream of detection. The cells that vendors structurally can’t sell (the Rel:Production and Rel:Recon cells plus the Human column) are the strongest evidence that MAC is filling a real gap, not relabeling existing checks.

Open follow-ups

  1. Does Coalesce Quality (formerly SYNQ) or Sifflet ship anything closer to a 2D taxonomy than the five-pillar / flat-library norm? Worth a single-vendor deep dive before the anchor article.
  2. Has anyone published the “vendor coverage of MAC’s 18 cells” comparison as a public artifact? If not, that is a free-standing newsletter unit.
  3. What does Anomalo’s unsupervised ML actually catch in Aggregate × Rel:Source vs Aggregate × Temporal? The marketing collapses these; a real implementation comparison would sharpen the “ML doesn’t replace categories” argument.
  4. Where do dbt 1.8+ unit tests fit in the MAC matrix? They feel like Row-Absolute with synthetic data, which would either confirm or stress the Row-Absolute cell’s coverage story.
  5. Is there a published academic taxonomy of DQ dimensions (Wang/Strong 1996 is the classic) that MAC should explicitly cite to anchor itself in the literature, not just the vendor landscape?

Sources

Vault:

Web: