

Sun Apr 12 2026 · case-study · status: active

Case Study: gold_opp_pipeline (MG Progress)

First production application of MAC. Canonical example for the drip course and consulting engagements.

Context

Why this is the canonical example

  1. Real production complexity: 30+ columns, 3 source systems, conditional override logic (PRO-303 COALESCE pattern), cross-model reconciliation requirements (gold_opp_pipeline ↔ gold_txn_pipeline ↔ gold_pbi_arr_schedule).

  2. MAC surfaces a real bug class: The dual-path pattern for closed-won vs open pipeline is exactly the kind of issue that conventional testing misses. Same column, same row, but the Relative:Source target depends on is_closed_won. The matrix forces explicit reasoning about this per cell.

  3. UAT integration proof: Existing 3,227-row UAT FAIL (“Opp txn type must align for Closed Won”) maps cleanly to a specific matrix cell (R1 Row × Relative:Production for the opportunity_type column). The framework retrofits existing client findings without redesign.

  4. Covers all three scopes meaningfully:

    • Column scope × 6 bases = ~30 columns × 6 = ~180 cells evaluated (many marked n/a or skip)
    • Row scope = 8 cross-column rules (R1-R8)
    • Aggregate scope = 6 rules (A1-A6), 3 with cross-model joins
  5. 14 existing tests → ~95 total checks: Demonstrates the typical gap between current-state dbt testing and MAC-level rigor — existing tests cover roughly 15% of the identified surface, an ~85% gap.
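The UAT mapping in point 3 can be sketched as a dbt singular test. This is a hedged illustration only — `opportunity_type`, `txn_type`, and `is_closed_won` are assumed column names, not the model's confirmed schema:

```sql
-- Hypothetical dbt singular test for the UAT rule
-- "Opp txn type must align for Closed Won" (column names assumed).
-- Any returned row is a FAIL.
select
    opportunity_id,
    opportunity_type,
    txn_type
from {{ ref('gold_opp_pipeline') }}
where is_closed_won
  and opportunity_type != txn_type
```

A healthy build returns zero rows; the 3,227 rows from the client's UAT would surface here directly, tying the existing finding to its matrix cell.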

The PRO-303 Pattern (key insight)

The founder’s gold_opp_pipeline has a COALESCE override: for closed-won opportunities, projected measures (O6), contract linkage (O5), classification (O7), and expansion flags (O8) are overridden with actuals from silver_fct_combined_transaction. Open pipeline retains projected values from Salesforce/OFC.
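A minimal sketch of that override, assuming illustrative column names (`projected_arr`, `actual_arr`) and simplified joins — the actual PRO-303 logic spans the O5-O8 column groups:

```sql
-- Hedged sketch of the PRO-303 COALESCE override (names illustrative)
select
    o.opportunity_id,
    o.is_closed_won,
    case
        when o.is_closed_won
            -- closed-won: override projection with actuals from the txn fact
            then coalesce(t.actual_arr, o.projected_arr)
        -- open pipeline: keep the projected value from Salesforce/OFC
        else o.projected_arr
    end as arr
from {{ ref('silver_opportunity') }} o
left join {{ ref('silver_fct_combined_transaction') }} t
    on t.opportunity_id = o.opportunity_id
```

The `case` branch is what makes a single output column dual-sourced: its Relative:Source target changes per row.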

This creates a dual-source testing requirement that the MAC matrix catches by construction.

Conventional schema.yml-based testing would never surface this gap because it treats each column as having a single source. The matrix forces the engineer to ask “what is the source for this cell under this condition?” — and the answer reveals that the override pattern needs two tests, not one.
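Under that reading, one conceptual check splits into two singular tests, one per condition branch. Column names here are assumptions for illustration; each query should return zero rows on a healthy build:

```sql
-- Test 1 (Relative:Source, closed-won branch):
-- closed-won rows must carry actuals from the transaction fact
select g.opportunity_id
from {{ ref('gold_opp_pipeline') }} g
join {{ ref('silver_fct_combined_transaction') }} t using (opportunity_id)
where g.is_closed_won
  and g.arr != t.actual_arr
```

```sql
-- Test 2 (Relative:Source, open-pipeline branch):
-- open rows must pass projected values through unchanged
select g.opportunity_id
from {{ ref('gold_opp_pipeline') }} g
join {{ ref('silver_opportunity') }} o using (opportunity_id)
where not g.is_closed_won
  and g.arr != o.projected_arr
```

In dbt these would live as two separate singular-test files, making the per-condition source contract explicit rather than implied.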

Coverage Summary

| Scope | Existing | New | Total |
| --- | --- | --- | --- |
| Column × Absolute | 12 | ~15 (accepted_values, format checks, conditional not_null) | ~27 |
| Column × Relative:Source | 2 | ~30 (pass-through fidelity, PRO-303 override validation) | ~32 |
| Column × Relative:Production | 0 | ~5 (dashboard reconciliation) | ~5 |
| Column × Temporal | 0 | ~10 (distribution stability, null rate drift) | ~10 |
| Column × Human | 0 | ~5 (manual spot-checks) | ~5 |
| Row (all basis types) | 0 | 8 (R1-R8) | 8 |
| Aggregate (all basis types) | 2 | 6 (A1-A4, A6 + A5 temporal) | 8 |
| Total | 14 | ~79 | ~95 |

Key Rules Worth Highlighting in the Drip Course

What this case study proves

  1. MAC is applicable to real production dbt models — not just toy examples.
  2. The matrix surfaces bugs by construction — the dual-path pattern isn’t something you’d think to test; the framework forces it.
  3. Severity tiers (Stop/Pause/Go) allow all tests to run at once without alert fatigue — agents triage; humans review exceptions.
  4. The artifact is a consulting deliverable — this document, sent to a client, immediately communicates “you’re missing 80%+ of your coverage.”