Michael Feathers - “Working Effectively with Legacy Code” (2004)

Note on sources: the book is paywalled. This assessment is reconstructed from the Wikipedia entry, the c2.com wiki pages on seams, Feathers’ conference talks (notably “Working Effectively with Legacy Code” at NDC and his 2010 talks on “The Deep Synergy Between Testability and Good Design”), and the often-quoted introduction. The book itself is cited as the canonical primary source.

Why this is in the vault

The audit-newsletter-outputs.py pattern and the verify-action PreToolUse hook are both Feathers seams in disguise (deterministic verification layers spliced between the LLM-generated output and the action that commits it), and naming the seam doctrine in the vault gives us the architectural primitive to keep building these retrofits on legacy LLM behavior.

The core argument / doctrine

Feathers’ opening move is a definition that is provocative on purpose: legacy code is code without tests. Not old code, not bad code; code without tests. The definition reframes the problem: “how do I get this dangerous codebase under control?” becomes “how do I get a test around this piece of behavior so I can change it safely?”

The whole book is about that one question. The answer is the seam.

A seam is a place in the code where you can change behavior without editing in that place. Seams are how you insert tests around code that was never written to be testable. Feathers catalogs seam types by mechanism; which ones are available depends on the language:

Preprocessor seams (C/C++ macros). You can swap out an entire definition by redefining the macro in the test build.
Link seams. You replace a linked library or object file with a test version, leaving call sites unchanged.
Object seams (the most powerful in OO languages). A method call on an object is a seam if you can substitute a different object implementing the same interface. This is the seam that virtual methods, interfaces, and dependency injection make available.

Each seam has an enabling point: the place where you control which implementation is in effect. For a preprocessor seam, the enabling point is the build flag. For a link seam, it is the build script or classpath, which sits outside the program text. For an object seam, the enabling point is whoever constructs the object that gets passed in.
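To make the object seam and its enabling point concrete, here is a minimal Python sketch. Every name in it (Mailer, SmtpMailer, RecordingMailer, InvoiceNotifier) is hypothetical and chosen only for illustration; none comes from the book.

```python
from typing import Protocol


class Mailer(Protocol):
    def send(self, to: str, body: str) -> None: ...


class SmtpMailer:
    """Production collaborator; stands in for code that touches the network."""

    def send(self, to: str, body: str) -> None:
        raise RuntimeError("would open a real SMTP connection")


class RecordingMailer:
    """Test fake: records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


class InvoiceNotifier:
    def __init__(self, mailer: Mailer) -> None:
        self._mailer = mailer

    def notify(self, customer_email: str, amount: float) -> None:
        # The seam: this call site is identical whichever Mailer is in effect.
        self._mailer.send(customer_email, f"Invoice due: {amount:.2f}")


# Enabling point: whoever constructs the notifier decides which behavior runs.
fake = RecordingMailer()
InvoiceNotifier(fake).notify("a@example.com", 12.5)
print(fake.sent)  # [('a@example.com', 'Invoice due: 12.50')]
```

The notify method is never edited to make it testable; the behavior changes at the constructor call, which is the property that makes the method call a seam.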

Feathers then catalogs dependency-breaking techniques that introduce seams safely when none exist. The named moves include: Extract Interface (turn a concrete dependency into an interface so a test fake can be substituted), Sprout Method / Sprout Class (put new behavior into a fresh, fully tested unit and call it from the legacy code, leaving the rest of the legacy code untouched), Parameterize Constructor (turn a dependency that the constructor creates internally into a constructor parameter so the test can pass a fake), Encapsulate Global References, Subclass and Override Method, and many more. Each technique is paired with a recipe: pre-conditions, mechanical steps, what to test before and after.
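As one illustration of these moves, the sketch below applies Parameterize Constructor to a hard-wired clock. The classes SessionBefore and Session are hypothetical, and the default argument is one common Python way to keep existing callers working; it is not presented as the book's own recipe.

```python
import time
from typing import Callable


class SessionBefore:
    """Before: the clock is hard-wired, so expiry cannot be tested deterministically."""

    def __init__(self) -> None:
        self._now = time.time
        self._started = self._now()

    def expired(self, ttl_seconds: float) -> bool:
        return self._now() - self._started > ttl_seconds


class Session:
    """After: same default behavior, but the caller may supply the clock."""

    def __init__(self, now: Callable[[], float] = time.time) -> None:
        self._now = now
        self._started = self._now()

    def expired(self, ttl_seconds: float) -> bool:
        return self._now() - self._started > ttl_seconds


# Test usage: a fake clock that the test advances by hand.
ticks = [100.0]
session = Session(now=lambda: ticks[0])
assert not session.expired(ttl_seconds=30)
ticks[0] = 131.0
assert session.expired(ttl_seconds=30)
```

Production call sites that write Session() are unaffected, which is what makes the change safe to apply before any tests exist.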

The discipline has a companion called the characterization test: when you don’t know what the legacy code is supposed to do, write a test that captures what it actually does today. Run it. Whatever passes becomes the documented behavior. The test isn’t aspirational; it pins down the current truth so refactoring can begin.
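A characterization test can be sketched as follows. The function legacy_slug is a made-up stand-in for untested legacy code; the expected values are whatever the code returns today, quirks included.

```python
def legacy_slug(title: str) -> str:
    """Hypothetical stand-in for untested legacy code with undocumented quirks."""
    out = title.strip().lower().replace(" ", "-")
    return out[:12]


def test_characterize_legacy_slug() -> None:
    # Expected values were copied from actual runs, not from a spec.
    assert legacy_slug("Hello World") == "hello-world"
    # Quirk pinned as-is: a double space becomes a double hyphen.
    assert legacy_slug("a  b") == "a--b"
    # Quirk pinned as-is: truncation at 12 characters can leave a trailing hyphen.
    assert legacy_slug("Refactoring Now") == "refactoring-"


test_characterize_legacy_slug()
```

None of these assertions claims the behavior is desirable. They only guarantee that a later refactoring which changes it will fail loudly.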

A short Feathers line that survives across the book and his talks: “Tests are the lever you didn’t know you had.” The seam is the fulcrum that makes the lever work.

Mapping against Ray Data Co

The audit-newsletter-outputs.py script is a Feathers seam, and we should name it that way. The header of that script (~/.claude/scripts/audit-newsletter-outputs.py) literally says: “PURE PYTHON deterministic code. Makes ZERO LLM calls.” That is the seam in operation. The “legacy code” in this analogy is /process-newsletter itself: an LLM-driven workflow that nobody can fully test because the LLM is non-deterministic. The audit script splices in at a specific seam, the boundary between the LLM-produced markdown file and its acceptance into the vault. At that seam, deterministic invariants (frontmatter shape, required sections, sponsored=bool, no duplicate (source, date, topic) tuple) are enforceable in ordinary Python. The 13 invariants I1..I13 in that file work like characterization tests: they pin down what /process-newsletter produces when it behaves, then promote that knowledge to a contract. Action implication: every existing LLM-driven skill that produces a structured artifact (vault notes, Notion writes, draft-replies, design-critic verdicts) has an equivalent seam available, and we should treat that seam as the natural home for a deterministic audit step.
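The shape of such an audit step can be sketched in a few lines of stdlib Python. This is not the contents of audit-newsletter-outputs.py: the section names, the failure messages, the simplified frontmatter parser, and the function names are all assumptions made for illustration, and only four of the invariant kinds listed above are shown.

```python
REQUIRED_SECTIONS = ("## Summary", "## Key points")  # hypothetical section names


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse a minimal `key: value` block between leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # closing fence never found


def audit(text: str, seen: set[tuple[str, str, str]]) -> list[str]:
    """Return the violated invariants; an empty list means accept."""
    fm = parse_frontmatter(text)
    if not fm:
        return ["frontmatter missing, empty, or unterminated"]
    failures: list[str] = []
    if fm.get("sponsored") not in ("true", "false"):
        failures.append("sponsored must be a boolean")
    for heading in REQUIRED_SECTIONS:
        if heading not in text:
            failures.append(f"missing section: {heading}")
    key = (fm.get("source", ""), fm.get("date", ""), fm.get("topic", ""))
    if key in seen:
        failures.append("duplicate (source, date, topic)")
    seen.add(key)
    return failures


note = """---
source: example-newsletter
date: 2026-05-05
topic: seams
sponsored: maybe
---
## Summary
text
"""
seen: set[tuple[str, str, str]] = set()
print(audit(note, seen))
# ['sponsored must be a boolean', 'missing section: ## Key points']
```

The audit never asks why the LLM produced a bad file. It only decides, deterministically, whether the file crosses the boundary into the vault.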

The verify-action PreToolUse hook is also a Feathers seam. The harness’s tool-dispatch loop is “legacy code” in the Feathers sense - we cannot rewrite the harness to be more testable, and the LLM driving it cannot be unit tested. But the harness exposes a seam: the PreToolUse hook fires between Ray’s decision to call a tool and the actual invocation. At that seam, deterministic Python (~/.claude/scripts/verify-action.py) inspects the tool input and either passes or blocks. The R001..R005 rules are characterization-tests-as-rules: each one pins down a behavior the founder has already corrected (“no em dashes,” “Discord external requests must @-mention founder,” “iMessage chat_id format”) and promotes that correction from a memory note into an executable check. The seam is what makes this retrofit possible. Without the PreToolUse hook surface, we would have no enabling point to splice in.
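A gate at this seam reduces to "payload in, pass-or-block out." The sketch below is not verify-action.py: the payload field names (tool_input, message), the rule label, and the exit-status convention (0 passes, 2 blocks) are assumptions for illustration, so check the harness documentation for the real contract.

```python
import json

EM_DASH = "\u2014"


def check_tool_call(payload: dict) -> tuple[bool, str]:
    """Return (allowed, reason) for one pending tool call."""
    text = str(payload.get("tool_input", {}).get("message", ""))
    if EM_DASH in text:
        return False, "no-em-dash: message contains an em dash"
    return True, "ok"


def gate(raw_event: str) -> int:
    """Decode one event and map the decision to an exit status."""
    allowed, reason = check_tool_call(json.loads(raw_event))
    if not allowed:
        print(reason)
    return 0 if allowed else 2  # assumed convention: nonzero blocks the call


event = json.dumps({"tool_input": {"message": "Shipping today \u2014 see notes"}})
print(gate(event))  # prints the reason, then 2
```

Because check_tool_call is a pure function of the payload, it can be unit tested with plain dictionaries even though neither the harness nor the LLM can be.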

Sprout Method is the right pattern for new RDCO verifier capabilities. Per Feathers: when you want to add behavior to legacy code, you don’t rewrite the legacy code; you sprout a new, fully tested method (or, when the change is larger, a new class) and call into it. Translated: when we want to add a new check (say, rule R006: “no PII in outbound messages”), we do not edit the harness or the LLM. We add a function to verify-action.py with its own fixtures, drop it into the RULES registry, and the existing seam handles dispatch. This is exactly the “Adding a new rule” section in ~/.claude/skills/verify-action/SKILL.md: it is Sprout Method codified into a skill.
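Sprouting a rule into a registry might look like the sketch below. The decorator, the run_rules dispatcher, and the email regex used as a PII proxy are assumptions for illustration; the source confirms only that a RULES registry exists, not how it is structured.

```python
import re
from typing import Callable, Optional

Rule = Callable[[str], Optional[str]]  # returns a violation message, or None
RULES: dict[str, Rule] = {}


def rule(rule_id: str) -> Callable[[Rule], Rule]:
    """Register a check under an ID; the dispatcher below never changes."""
    def register(fn: Rule) -> Rule:
        RULES[rule_id] = fn
        return fn
    return register


def run_rules(message: str) -> list[str]:
    """Existing dispatch: run every registered rule and collect violations."""
    results = []
    for rule_id, fn in RULES.items():
        problem = fn(message)
        if problem:
            results.append(f"{rule_id}: {problem}")
    return results


# The sprout: a new, separately tested function. Nothing above is edited.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@rule("R006")
def no_pii(message: str) -> Optional[str]:
    if EMAIL.search(message):
        return "possible email address in outbound message"
    return None


# Fixtures for the new rule only.
assert no_pii("ping me at jane.doe@example.com") is not None
assert no_pii("no contact details here") is None
assert run_rules("mail bob@example.org") == [
    "R006: possible email address in outbound message"
]
```

The new behavior arrives with its own fixtures, and the only contact with existing code is the registration line, which is the property Sprout Method is after.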

Connection to the MAC framework. The Scope x Basis matrix is a different kind of test catalog (it covers data outputs rather than code), but the underlying philosophy is the same: rather than making the LLM or the dbt model “more testable,” we identify the seams (column boundary, row boundary, aggregate boundary, source boundary, recon boundary) and insert deterministic checks at each. Feathers wrote the book on retrofitting tests; the matrix is its data-engineering descendant.

The scope of the doctrine to internalize: every untested behavior at RDCO has a seam somewhere. Find the seam, insert deterministic Python, characterize the behavior, promote characterization to contract. That is the architectural primitive Feathers gives us; everything else (audit-newsletter-outputs, verify-action, future audit-* scripts) is an instance.

Related

~/.claude/scripts/audit-newsletter-outputs.py - the canonical RDCO seam example (frontmatter / structure invariants)
~/.claude/skills/verify-action/SKILL.md - the PreToolUse seam for outbound message verification
2026-05-04-indy-dev-dan-pi-coding-agent-reviews-like-you - verifier-agent pattern; the LLM-driven cousin of the deterministic seam
2026-05-05-beck-tdd-by-example - the design-pressure ancestor; this note’s companion
2026-05-05-tdd-is-dead-debate-dhh-beck-fowler - where the deterministic-seam approach sits in the broader TDD debate
../01-projects/data-quality-framework/testing-matrix-template.md - Scope x Basis matrix as a data-engineering seam catalog
2026-05-04-karlmehta-llm-commoditization-intelligence-rails - the orchestration-layer thesis where seams accumulate value
