Kent Beck - “Test-Driven Development: By Example” (2002)
Why this is in the vault
This morning’s ~/.claude/skills/verify-action/ build was textbook red-green-refactor (test fixtures written first, implementation iterated until 6/6 fixtures passed), so the canonical TDD doctrine needs to live in the vault as the named ancestor of how we now ship verification code at RDCO.
The core argument / doctrine (from the 2002 text)
Beck reduces TDD to two operating rules and a three-step inner loop. The two rules are stated in the Preface (p. ix) and restated in the Introduction (p. xix):
- Write new code only if an automated test has failed.
- Eliminate duplication.
These two rules generate a programming cycle Beck calls Red / Green / Refactor (Beck 2002, Preface, p. x). The cycle is restated several times across the book; the most cited form (Part I opener, p. 1, and Ch. 17 “Process”) is:
- Add a little test.
- Run all tests and see the new one fail.
- Make a little change.
- Run all tests and succeed.
- Refactor to remove duplication.
Quick green excuses all sins, but only for a moment (Beck 2002, ch. 2). The point of step 3 (“Make a little change”) is to get the bar green by any means - a constant, a hard-coded answer, copy-paste - because a known-passing baseline is the safety net that makes the structural change in step 5 (“Refactor to remove duplication”) reversible.
The test list
Before writing code, Beck has the developer write a list of every test the feature will need to pass before it can be called done (Beck 2002, ch. 1; pattern formalized in ch. 25 as “Test List”). The list is a TODO that gets crossed off as red turns green. The pattern, with bold-when-active and strike-through-when-done, is shown literally in Ch. 1 of the Money Example. Tests that surface mid-cycle are added to the list, not chased immediately; that deferral is what protects the loop from blowing up in scope.
The three approaches to going from red to green
In the “Money Retrospective” (Beck 2002, ch. 17, “One Last Review”), Beck names exactly three approaches to making a test pass cleanly. They are detailed in ch. 28 “Green Bar Patterns”:
- Fake It (‘Til You Make It). First implementation after a red bar returns a constant. Once green, gradually transform the constant into an expression using variables. Beck attributes the power of Fake It to two effects: psychological (green feels different from red - you can refactor with confidence) and scope control (one concrete example beats premature generalization).
- Triangulate. Drive abstraction conservatively by adding a second example. Only abstract when at least two assertions force it. Beck explicitly says he reserves Triangulate for when he is “really, really unsure” of the right abstraction; otherwise he prefers Fake It or Obvious Implementation (Beck 2002, ch. 28, “Triangulate”).
- Obvious Implementation. When you know what to type and can do it quickly, just type it in. Beck warns this is “second gear” - downshift to Fake It or Triangulate the moment a red bar surprises you (Beck 2002, ch. 28, “Obvious Implementation”).
The three live on a single axis of step size; together with Test List and One Step Test (ch. 26) they form the actual operational core of the book.
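The difference in step size between the approaches can be sketched with a tiny addition function. This is an illustrative sketch only: `plus_fake`, `plus`, and `passes` are hypothetical names, and the example data is chosen for the demonstration.

```python
def plus_fake(augend, addend):
    # Fake It: return the constant that satisfies the only example so far.
    return 4


def plus(augend, addend):
    # The generalization. Reached directly via Obvious Implementation, or
    # forced by a second example when triangulating.
    return augend + addend


def passes(impl, examples):
    # An implementation is green when every (a, b, expected) example holds.
    return all(impl(a, b) == expected for a, b, expected in examples)


one_example = [(3, 1, 4)]
two_examples = [(3, 1, 4), (3, 4, 7)]
```

With `one_example` the fake is green; adding the second example turns `plus_fake` red while `plus` stays green. That second assertion is the triangulation step: it is the thing that forces the abstraction.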
What Beck does NOT codify in this book
The frequently cited “four rules of simple design” (passes tests, expresses intent, no duplication, fewest elements) are NOT enumerated as a numbered list anywhere in the 2002 text. Beck states only the two operating rules (above). The four-rules-of-simple-design framing is from his earlier work (Smalltalk Best Practice Patterns, XP Explained 1999) and from Fowler’s bliki. The 2002 book is narrower: two rules, one cycle, three approaches, plus the patterns catalog.
The TDD patterns catalog (Part III index structure only)
Beck organizes the patterns into named groups. The group structure (Beck 2002, Chs. 25-31):
- Ch. 25 Test-Driven Development Patterns: Test (noun), Isolated Test, Test List, Test First, Assert First, Test Data, Evident Data
- Ch. 26 Red Bar Patterns: One Step Test, Starter Test, Explanation Test, Learning Test, Another Test, Regression Test, Break, Do Over
- Ch. 27 Testing Patterns: Child Test, Mock Object, Self Shunt, Log String, Crash Test Dummy, Broken Test, Clean Check-in
- Ch. 28 Green Bar Patterns: Fake It, Triangulate, Obvious Implementation, One to Many
- Ch. 29 xUnit Patterns: Assertion, Fixture, External Fixture, Test Method, Exception Test, All Tests
- Ch. 30 Design Patterns: Command, Value Object, Null Object, Template Method, Pluggable Object, Pluggable Selector, Factory Method, Imposter, Composite, Collecting Parameter
- Ch. 31 Refactoring patterns
Beck’s stated motivations (Preface)
Beck frames TDD as a way to manage fear during programming (Beck 2002, Preface, “Courage”, p. xi). Fear makes you tentative, makes you communicate less, makes you avoid feedback. Tests are the teeth of a ratchet that lets you rest between bouts of cranking on a hard problem - once a test is green, that piece of progress is locked in. The deeper claim is in Ch. 32 “Mastering TDD”: TDD shortens the feedback loop on design decisions from weeks-to-months down to seconds-to-minutes. Coverage is a downstream effect, not the goal.
Mapping against Ray Data Co
The verify-action build this morning followed Beck’s loop exactly, and we can now name the steps with primary-source vocabulary.
- We wrote ~/.claude/skills/verify-action/test-fixtures.json first - 6 cases. This is Beck’s Test List (ch. 25): every test we knew we needed to pass before calling the rule corpus done, written down before any implementation existed. The R001..R005 numbering froze the list into a file.
- Each fixture started red. Implementation in ~/.claude/scripts/verify-action.py was Obvious Implementation (ch. 28) for the simple rules: a literal regex for the U+2014 codepoint (R001) is exactly the case where Beck says “just type it in” - we knew what to type and we could do it quickly. Where we hard-coded patterns to get green and then generalized, that was Fake It (ch. 28).
- After all fixtures passed, we extracted the RULES registry and the per-tool dispatch. That is the “remove duplication” step of the cycle - rule #2 of Beck’s two rules.
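A minimal sketch of what a fixtures-first rule registry could look like. Everything here is an assumption for illustration: the fixture schema (`rule`, `text`, `expect_violation`), the function names, and the shape of `RULES` are hypothetical and are not taken from the actual test-fixtures.json or verify-action.py. Only the R001 idea (a literal match on U+2014) comes from this note; the other rules and the per-tool dispatch are omitted.

```python
import re

# R001: flag any occurrence of the U+2014 codepoint (written as an escape).
EM_DASH = re.compile("\u2014")


def r001_no_em_dash(text):
    # Returns True when the text violates the rule.
    return EM_DASH.search(text) is not None


# Hypothetical registry: rule id -> checker. This is the structure that the
# "remove duplication" step would extract once the fixtures are green.
RULES = {"R001": r001_no_em_dash}

# Hypothetical fixture schema, written before the checker existed.
FIXTURES = [
    {"rule": "R001", "text": "plain hyphen - fine", "expect_violation": False},
    {"rule": "R001", "text": "bad \u2014 dash", "expect_violation": True},
]


def run_fixtures(rules, fixtures):
    # Return the fixtures whose outcome differs from the expectation;
    # an empty list means the bar is green.
    return [
        fx for fx in fixtures
        if rules[fx["rule"]](fx["text"]) != fx["expect_violation"]
    ]
```

Running `run_fixtures(RULES, FIXTURES)` returns an empty list once the rule is implemented; swapping in a stub checker that always returns `False` leaves the positive fixture red, which is the state each fixture started in.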
The Scope x Basis matrix in ~/rdco-vault/01-projects/data-quality-framework/testing-matrix-template.md is the data-output analogue of Beck’s Test List. Where Beck says to enumerate every test you’ll need before writing code, the matrix says to enumerate every check (one per scope, one per basis) before writing the model. The “Definition of Done” coverage checklist is the matrix’s version of “the list is empty, so we are done.” Naming the lineage matters because it tells future model authors why the matrix is non-negotiable rather than a tax.
Where verify-action diverges from Beck (the load-bearing distinction). TDD in Beck’s sense is design pressure: writing the test first forces the developer to invent the API they wish they had, which improves the design (Beck 2002, ch. 32, “Why does TDD work?”). The verify-action runtime hook is post-hoc - it does NOT shape the design of the LLM behavior, it polices the output. We are on the COVERAGE side of the cycle (rule #1 of Beck’s two rules: never ship code that doesn’t have a passing test) but not on the DESIGN-PRESSURE side. That distinction is the architectural choice from 2026-05-05-tdd-is-dead-debate-dhh-beck-fowler and is deliberate.
Beck’s Test (noun) framing predicts our hook. Ch. 25 opens by distinguishing test-the-verb (poking buttons, looking at output) from test-the-noun (a procedure that runs automatically and produces a boolean pass/fail). The verify-action hook is Beck’s “Test (noun)” promoted from compile-time-on-the-developer’s-laptop to runtime-on-Ray’s-tool-call. Same primitive, new substrate.
Rule corpus discipline = Beck’s “fewest elements” instinct, even though that rule isn’t enumerated in this book. v1 has five rules and two are warn-only. The temptation will be to grow the corpus to dozens; Beck’s general posture (across this book, ch. 32 in particular) is that you only add a rule when a real failure forced its existence. The skill’s “Adding a new rule” section (“Add at least one fixture, positive + negative”) locks the red-green-refactor discipline into the rule-extension flow.
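The “positive + negative” requirement could itself be checked mechanically. The sketch below is an assumption, not the skill's actual tooling: `fixture_gaps` and the fixture schema are hypothetical, “positive” is taken to mean a fixture that expects a violation, and the R002 entry is a placeholder with no real rule content.

```python
def fixture_gaps(rule_ids, fixtures):
    # Map each rule id to the fixture kinds it is still missing.
    gaps = {}
    for rule_id in rule_ids:
        seen = {
            fx["expect_violation"] for fx in fixtures if fx["rule"] == rule_id
        }
        missing = []
        if True not in seen:
            missing.append("positive")  # no fixture that should trip the rule
        if False not in seen:
            missing.append("negative")  # no fixture that should pass cleanly
        if missing:
            gaps[rule_id] = missing
    return gaps


SAMPLE_FIXTURES = [
    {"rule": "R001", "text": "bad \u2014 dash", "expect_violation": True},
    {"rule": "R001", "text": "plain - hyphen", "expect_violation": False},
    {"rule": "R002", "text": "placeholder input", "expect_violation": True},
]
```

Here `fixture_gaps(["R001", "R002"], SAMPLE_FIXTURES)` reports that R002 lacks a negative fixture. A check like this would keep a new rule from entering the corpus without its own red-then-green evidence.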
Where this primary-source upgrade changed the prior reconstruction
- The “four rules of simple design” claim was wrong as attributed. The prior reconstructed-from-web version listed four numbered rules (passes tests, expresses intent, no duplication, fewest elements) as if they were in this book. They are not. The 2002 text states only two operating rules. The four-rules framing comes from earlier Beck work and from Fowler’s bliki. Corrected here.
- The “I get paid for code that works, not for tests” Beck quote attributed via DHH does not appear in this book. The prior version cited it as if it were canonical. It is from Beck’s other surfaces (interviews, c2 wiki). Removed - this note now only cites what is actually in the 2002 text.
- “Fake it till you make it” is the parenthetical of the actual pattern name, not the pattern name itself. Beck calls it “Fake It (‘Til You Make It)” in ch. 28. Minor, but the prior version used the colloquial form.
- The Test List pattern is more disciplined than the prior version implied. Beck specifies that the list contains: examples of every operation, null versions of operations that don’t exist yet, and refactorings expected to be needed by end of session. He has both a “now” list and a “later” list and explicitly describes moving items between them. The prior version compressed this into “brainstorm the tests.”
Related
- 2026-05-04-indy-dev-dan-pi-coding-agent-reviews-like-you - the verifier-agent pattern this morning’s build instantiates
- ~/.claude/skills/verify-action/SKILL.md - the artifact that came out of this loop
- 2026-05-05-feathers-working-effectively-with-legacy-code - the seam concept that lets the verifier retrofit onto an untested system
- 2026-05-05-tdd-is-dead-debate-dhh-beck-fowler - where verify-action sits in the TDD-debate (coverage side, not design-pressure side)
- 2026-05-05-hughes-quickcheck-property-based-testing - the property-based v2 direction for the rule corpus
- 2026-05-05-beck-tidy-first-2024 - Beck’s 2024 framing of small-step ship discipline (separate structural from behavioral change)
- ../01-projects/data-quality-framework/testing-matrix-template.md - the Scope x Basis matrix as the data-output Test List equivalent
- 2026-04-15-commoncog-becks-measurement-model - other Beck content already in the vault
- ../04-tooling/personal-library-index.md - this note is the primary-source assessment indexed there