“So you’ve been asked to ‘take over’ some old data pipeline…” — @RandyAu
Why this is in the vault
Founder shared 2026-04-21 09:28 EDT, no comment. Directly relevant to the MAC (Model Acceptance Criteria) content series and to drafted concept pages CA-017 (externalized cost) and CA-019 (design-for-controlled-decay). Au’s non-technical-first framework for pipeline inheritance is exactly the companion to MAC: MAC tells you when the pipeline is acceptable; Au tells you what to do when you inherit without MAC attached.
The core argument
When handed an orphaned data pipeline, do the context triage before the code review. Most inherited pipelines fall into one of three buckets:
- Unused — nobody actually depends on the output anymore
- Measuring the wrong thing — original KPI is obsolete or has drifted
- Built on outdated assumptions — stack, data sources, or org structure has changed
In all three cases, the right answer is often delete, not maintain. The experienced engineer’s cranky questioning (“who uses this? why?”) is the defense against accidentally inheriting maintenance burden for something nobody needs.
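The three-bucket triage above can be sketched as a small decision function. This is a minimal illustration, not Au's own method: `PipelineContext`, `Verdict`, and `triage` are hypothetical names, and the three fields assume you have already collected answers by talking to people rather than by reading code.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    UNUSED = "unused"
    WRONG_METRIC = "measuring the wrong thing"
    OUTDATED_ASSUMPTIONS = "built on outdated assumptions"
    KEEP = "keep and review the code"


@dataclass
class PipelineContext:
    # Hypothetical fields: answers gathered from stakeholders, not from the code.
    active_consumers: int         # people or systems that still read the output
    kpi_still_relevant: bool      # does the original metric still drive decisions?
    assumptions_still_hold: bool  # stack, data sources, org structure unchanged?


def triage(ctx: PipelineContext) -> Verdict:
    """Check the three buckets in order; only a pipeline that clears all
    three is worth a code review."""
    if ctx.active_consumers == 0:
        return Verdict.UNUSED
    if not ctx.kpi_still_relevant:
        return Verdict.WRONG_METRIC
    if not ctx.assumptions_still_hold:
        return Verdict.OUTDATED_ASSUMPTIONS
    return Verdict.KEEP
```

Any verdict other than `KEEP` is a prompt to ask whether deletion is the right outcome before spending time on the code.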
The framework (checklist)
- Assess context (non-technical first). Who used this? Why was it built? Does anyone still need it? Is there an internal champion who will fight to keep it alive, or did the owner leave two re-orgs ago?
- Evaluate current relevance. Should it exist at all? Should ownership shift to a different team? Does it need modernization or just decommissioning?
- Trace code logic. Only now — forward from entry points or backward from consumers. Map the DAG.
- Verify correctness. Does the pipeline actually do what its name / docstring / Slack descriptions claim? (Often: no.)
- Patch vs. rebuild decision. Emergency fixes first if production is burning. Then deeper rebuild vs. retire analysis.
- Document findings. Your investigation notes ARE the documentation. Write them in a way the next inheritor can read.
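For the "trace code logic" step, tracing backward from consumers can be sketched as a graph walk. This is an illustrative sketch under assumptions: the pipeline is represented as a plain dict mapping each step to the steps it reads from, and every step name in `EXAMPLE_DAG` is hypothetical.

```python
from collections import deque


def upstream_of(dag, start):
    """Return every step that `start` depends on, directly or transitively."""
    seen = set()
    queue = deque(dag.get(start, []))
    while queue:
        step = queue.popleft()
        if step in seen:
            continue
        seen.add(step)
        queue.extend(dag.get(step, []))
    return seen


def dead_steps(dag, live_consumers):
    """Return steps that no live consumer depends on (candidates to retire)."""
    all_steps = set(dag) | {dep for deps in dag.values() for dep in deps}
    needed = set(live_consumers)
    for consumer in live_consumers:
        needed |= upstream_of(dag, consumer)
    return all_steps - needed


# Hypothetical pipeline: each key lists the steps it reads from.
EXAMPLE_DAG = {
    "weekly_dashboard": ["clean_events"],
    "legacy_export": ["clean_events", "old_crm_dump"],
    "clean_events": ["raw_events"],
}

# Assume the context triage found that only the dashboard is still used.
print(sorted(dead_steps(EXAMPLE_DAG, ["weekly_dashboard"])))
# prints ['legacy_export', 'old_crm_dump']
```

Feeding the walk with the consumers confirmed in the context steps keeps the code trace scoped to what people actually use.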
Mapping against Ray Data Co
MAC content series (direct): Au’s framework is the inheritor’s dual to MAC’s author’s framework. MAC says: “the model is acceptable when X, breaks when Y.” If the original author attached MAC, the inheritor reads the MAC card, re-validates, moves on or decommissions cleanly. If the original author didn’t attach MAC, the inheritor has to reverse-engineer it — exactly what Au’s checklist produces. Cite this in the MAC anchor piece as “the pipeline you’re inheriting right now, but didn’t write.” Suggested lead: “The pipeline you’re about to inherit has probably been lying to you for 18 months. Here’s how to prove it — and here’s the artifact the previous team should have left behind.”
Externalized cost (concepts/externalized-cost, CA-017, 7 sources canon-tier): inherited pipelines are the textbook engineering instance. The original team extracted the credit (shipped the feature, got the promotion), and externalized the maintenance cost to a future team that didn’t consent to the liability. Au’s line “you’re the last poor sap that’s touched it” is a clean operator-voice articulation of the pattern. Worth adding as an 8th source to CA-017.
Design-for-controlled-decay (concepts/design-for-controlled-decay candidate, CA-019, 4 sources ripe-undrafted): most inherited pipelines are failure-to-retire cases. They should have had an explicit sunset date, a consumer-count gate, or a periodic “still needed?” review. This is the Fontana-slot-cutting analog at the data-pipeline layer. Au’s framework is the recovery tool for teams that didn’t design decay in. Worth citing in the CA-019 draft when I get to it.
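The three decay mechanisms named above (sunset date, consumer-count gate, periodic review) could be attached to a pipeline as a small policy object. This is a sketch under assumptions, not an existing CA-019 artifact: `DecayPolicy`, `retirement_flags`, and all thresholds are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DecayPolicy:
    # Hypothetical policy fields, one per mechanism in the note.
    sunset: date             # explicit date after which the pipeline must be re-justified
    min_consumers: int       # consumer-count gate
    review_every: timedelta  # cadence of the "still needed?" review


def retirement_flags(policy, today, consumer_count, last_review):
    """Return the reasons, if any, that this pipeline should be up for retirement."""
    flags = []
    if today >= policy.sunset:
        flags.append("past sunset date")
    if consumer_count < policy.min_consumers:
        flags.append("consumer count below gate")
    if today - last_review > policy.review_every:
        flags.append("'still needed?' review overdue")
    return flags


policy = DecayPolicy(
    sunset=date(2027, 1, 1),
    min_consumers=1,
    review_every=timedelta(days=180),
)
print(retirement_flags(policy, date(2026, 4, 21), 0, date(2025, 6, 1)))
# prints ['consumer count below gate', "'still needed?' review overdue"]
```

An empty list means the pipeline has cleared every gate for now. Any flag hands the next owner a concrete reason to open the retire-or-keep conversation.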
Sanity Check candidate angle: this has direct newsletter legs. Hook: “The pipeline you’re about to inherit has probably been lying to you for 18 months. Here’s how to prove it.” Ties Au’s framework to MAC as the positive reframe: “if your team had shipped MAC alongside the pipeline, you wouldn’t be here.” Queue for SC v3 draft calendar.
Connects to the Felix build in a smaller way: when we eventually migrate the Mac Mini’s existing cron-script ecology to a more disciplined Felix-Tier-2 architecture, we’ll be inheriting from ourselves. Running Au’s checklist on our own scripts before Felix-migrating them is a dogfooding exercise.
Sources & bias
- Format: Counting Stuff is Randy Au’s weekly newsletter, self-hosted on counting-stuff.com (canonical URL, not the Substack redirect)
- Author: Randy Au — Quantitative UX researcher at Google, former data analyst. Strong operator voice in the “cranky experienced engineer” register. High-signal account, worth tracking.
- Sponsor: None detected. Self-hosted Ghost or similar, no ads visible.
- Per copy-paste caution: paraphrased throughout, quotes ≤15 words with marks.
Related
- 01-projects/data-quality-framework/testing-matrix-template — MAC anchor
- concepts/externalized-cost — CA-017, this is source #8
- concepts/layered-defense-architecture — CA-016
- concepts/verifier-as-epistemology — MAC is the verifier Au’s framework reconstructs
- 2026-04-14-joe-reis-state-of-data-modeling-april-2026 — Reis on what data teams are actually doing in 2026
- 2026-04-20-practical-data-modeling-ch15-people-organizations — Reis Ch15 on situational awareness at a new team
Tracked-author candidate
Randy Au (@RandyAu) — add to tracked-authors list. High-signal operator voice. Counting Stuff should be added to the F-list (follow-forward) at minimum in ~/rdco-vault/01-projects/process-newsletter/README.md. Worth considering promotion to K-list if the next 2-3 issues hold this quality bar.