Kent Beck - “Tidy First?” (2024)
Sources: the book is paywalled (O’Reilly, 124 pages, three parts: Tidyings / Managing / Theory). Beck’s “Software Design: Tidy First?” Substack at tidyfirst.substack.com is the open companion. The publisher’s table of contents and the Wikipedia/secondary summaries are the primary sources for this assessment, with quotes only from open surfaces.
Why this is in the vault
This morning’s verify-action build was a literal small-step ship (test fixtures, then minimal implementation, then 6/6 pass, then PR-shape ship with founder review left as a separate step), and Beck’s structural-vs-behavioral split is the doctrine that names what we did so we can keep doing it.
The core argument / doctrine
Beck’s central distinction in Tidy First? is that structural change is not the same as behavioral change. A change to a system either modifies what it does (behavioral) or modifies how it is organized without modifying what it does (structural). The two should not ride in the same commit.
A tidying is a small, safe, reversible structural change. Beck catalogs 15 tidyings in Part I of the book - guard clauses, dead code removal, normalize symmetries, new interface / old implementation, reading order, cohesion order, move declaration and initialization together, explaining variable, explaining constant, explicit parameters, chunk statements, extract helper, one pile (and breaking it apart), explaining comments, delete redundant comments. Each tidying is small enough that the cost of doing it is bounded and the cost of undoing it is bounded.
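To make "structural, not behavioral" concrete, here is a minimal sketch of one tidying from the catalog, guard clauses. The function names and the shipping rule are invented for illustration and do not come from the book; the point is only that the before and after versions agree on every input.

```python
def shipping_cost_before(order):
    # Nested conditionals: the main rule is buried two levels deep.
    if order is not None:
        if order["items"]:
            return 5.0 + 0.5 * len(order["items"])
        else:
            return 0.0
    else:
        return 0.0


def shipping_cost_after(order):
    # Guard clauses: exit early on the edge cases, then state the main rule.
    if order is None:
        return 0.0
    if not order["items"]:
        return 0.0
    return 5.0 + 0.5 * len(order["items"])


# The tidying counts as structural only if both versions agree on every input.
samples = [None, {"items": []}, {"items": ["a"]}, {"items": ["a", "b", "c"]}]
assert all(shipping_cost_before(s) == shipping_cost_after(s) for s in samples)
```

Undoing the change is a one-function revert, which is what keeps the cost of the tidying bounded in both directions.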
The decision framework, the load-bearing asset of Part II, is First, After, Later, Never:
- Tidy first if the tidying makes the upcoming behavioral change clearly easier and the tidying is genuinely small.
- Tidy after if you discover the tidying mid-feature; finish the behavior change, then tidy in a separate commit.
- Tidy later if the tidying is real but not blocking; queue it.
- Tidy never if the code in question won’t be touched again or the tidying doesn’t pay back.
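The four bullets above can be read as a small decision procedure. The sketch below is one possible encoding, not Beck's: the parameter names and the order in which the checks run are assumptions made for this note.

```python
from enum import Enum


class Timing(Enum):
    FIRST = "first"
    AFTER = "after"
    LATER = "later"
    NEVER = "never"


def when_to_tidy(*, pays_back, will_touch_again, eases_next_change,
                 is_small, found_mid_feature):
    """Map the four bullets onto a single answer (hypothetical encoding)."""
    # Never: the code will not be touched again, or the tidying does not pay back.
    if not will_touch_again or not pays_back:
        return Timing.NEVER
    # After: discovered mid-feature, so finish the behavior change first.
    if found_mid_feature:
        return Timing.AFTER
    # First: clearly eases the upcoming change and is genuinely small.
    if eases_next_change and is_small:
        return Timing.FIRST
    # Later: real but not blocking, so queue it.
    return Timing.LATER
```

For example, a tidying that pays back, is small, and eases the next change, but was only noticed halfway through the feature, comes out as `Timing.AFTER` under this ordering.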
The economic argument in Part III: structural improvements have option value; they make future changes cheaper. But they have a cost today, and the present-value calculation has to favor the option for the tidying to be worth doing now. Beck’s epigram: “a dollar today is worth more than tomorrow.” (Under 15 words, from publisher overview.)
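A rough worked version of that present-value comparison, as a sketch only. The discount rate, the dollar figures, and the `probability_used` factor (a crude stand-in for the chance that the future change ever happens) are assumptions for illustration, not numbers or formulas taken from the book.

```python
def present_value(future_amount, annual_rate, years):
    # Standard discounting: money received later is worth less today.
    return future_amount / (1 + annual_rate) ** years


def worth_tidying_now(cost_today, future_saving, annual_rate, years,
                      probability_used=1.0):
    # Expected discounted saving must exceed what the tidying costs today.
    expected = probability_used * present_value(future_saving, annual_rate, years)
    return expected > cost_today


# Illustrative numbers only: a tidying costing 100 today that would save 150
# in one year at a 10% discount rate.
print(worth_tidying_now(100, 150, 0.10, 1))                        # saving is certain
print(worth_tidying_now(100, 150, 0.10, 1, probability_used=0.5))  # saving is a coin flip
```

With a certain saving the discounted value is about 136 and the tidying clears the bar; at a 50% chance it drops to about 68 and does not, which is the "later" or "never" case.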
The unit-of-ship discipline that falls out of all of this: separate commits / PRs for tidyings vs feature work. A reviewer reading a tidying-only diff confirms “behavior unchanged” by reading the tidying recipe; they don’t need to think about whether the change is correct, only whether the recipe was followed. A reviewer reading a feature-only diff confirms “behavior changed correctly” without having to mentally subtract out the cosmetic changes. Mixing the two erases both efficiencies.
A short Beck line from the book’s positioning: “Make large changes in small, safe steps.” (Seven words; from the publisher overview.) The whole book is an instance of that principle applied recursively.
Mapping against Ray Data Co
The verify-action build was structurally Beckish, and we should be deliberate about it. Recap of what shipped this morning:
- Wrote test-fixtures.json. Structural change (new file, no behavior yet). Reversible by deletion.
- Wrote verify-action.py until 6/6 fixtures passed. Behavioral change (new code that will, when wired, gate tool calls).
- Wrote run-tests.sh, SKILL.md, proposed-settings.json.snippet. Structural change (scaffolding around the behavior).
- Did not edit ~/.claude/settings.json to wire the hook. The activation step is left to the founder as a separate, reviewable, reversible commit.
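The fixtures-first rhythm in the recap can be sketched as follows. This is not the shipped code: the contents of test-fixtures.json and verify-action.py are not reproduced in this note, so the fixture schema, the `verify_action` rule, and the `run_fixtures` helper below are all hypothetical stand-ins.

```python
import json

# Hypothetical fixture format; the real test-fixtures.json schema is not shown here.
FIXTURES_JSON = """
[
  {"name": "read is allowed", "input": {"tool": "read_file"}, "expected": "allow"},
  {"name": "delete is blocked", "input": {"tool": "delete_file"}, "expected": "block"}
]
"""


def verify_action(action):
    # Hypothetical minimal rule, written only to make the fixtures above pass.
    return "block" if action["tool"].startswith("delete") else "allow"


def run_fixtures(fixtures, verify):
    # Return (number passed, names of failing fixtures).
    failures = [f["name"] for f in fixtures if verify(f["input"]) != f["expected"]]
    return len(fixtures) - len(failures), failures


passed, failures = run_fixtures(json.loads(FIXTURES_JSON), verify_action)
print(f"{passed}/{passed + len(failures)} pass")
```

Nothing here touches any settings file: the fixtures and the implementation can be reviewed and reverted on their own, and wiring the check into the harness stays a separate step.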
That last move is the load-bearing one in Beck’s frame. Activating the hook is a behavioral change of the system (the harness’s tool-dispatch behavior changes the moment settings.json is edited). Bundling the activation with the code build would have mixed structural and behavioral changes in a single ship, which is exactly what Beck’s discipline says not to do. The skill’s “Activation steps for founder” section is the explicit handoff that keeps the structural and behavioral changes on separate commits.
The “First, After, Later, Never” framework as a queue discipline at RDCO. The vault has a 02-sops/ directory and a Notion task board; both are queues. Beck’s framework gives them a vocabulary:
- Tidy first. Apply when a vault structural cleanup makes an upcoming content build easier (e.g., creating a 06-reference/transcripts/ subdirectory before starting a YouTube ingest run).
- Tidy after. Apply when a content build surfaces a structural opportunity (e.g., the indy-dev-dan note revealed that “verifier-agent” deserves a dedicated 05-concepts/ article; that tidy should ship as its own note, not as an edit to the source).
- Tidy later. Notion-board candidate. Real but non-blocking structural improvements queue here.
- Tidy never. The clearest filter against the founder’s tendency toward gold-plating. If a tidy doesn’t pay back (we won’t touch this dataset again, this skill is going to be deprecated, this concept doesn’t connect to anything), Beck’s framework explicitly endorses leaving it alone.
Connection to the existing RDCO ship-rules. Three feedback memories converge on Beck’s doctrine and should be named as variants of it:
- feedback_no_em_dashes is a structural/behavioral disambiguation in disguise. Replacing an em dash with a hyphen is structural (no semantic change); changing prose to avoid the em dash entirely is behavioral. The rule is: prefer the structural change unless the structural change degrades meaning, in which case do the behavioral change. That is a tidy-first call.
- feedback_no_claudemd_state_drift is the inverse: do not let workspace state (which is behavioral, time-varying) leak into CLAUDE.md (which is structural, stable). The rule enforces the separation Beck draws between the two kinds of change.
- feedback_pr_only_workflow is the unit-of-ship discipline made concrete. Branch + PR + autonomous review/merge is the RDCO realization of “small safe steps with separate commits for separate concerns.” The PR boundary is where Beck’s structural / behavioral split gets exercised by the reviewer.
MAC framework angle. The Scope x Basis matrix’s “Definition of Done” coverage checklist is itself a tidying applied to data engineering. Promoting the matrix from “good idea” to “must-pass before ship” is a structural change to the pipeline (how we organize the model build) that does not change what the pipeline does. It is the team-process analogue of extract helper. Naming this in Beck’s vocabulary helps the founder talk about why MAC is “a tidying” rather than “a process tax.”
The deeper claim to internalize. Beck’s 2024 book is the maturation of a forty-year career in small-step discipline. RDCO’s autonomous loop is built on the same primitive: every shipped change is one tidy or one feature, every PR is one concern, every skill build follows the test-fixtures-first-then-implementation-then-activation rhythm. We did not derive this from Beck; we built it because it was the only thing that worked. Naming the lineage gives us a teacher to consult when the discipline is being tested.
Related
- 2026-05-05-beck-tdd-by-example - the 2002 ancestor; red-green-refactor as the small-step discipline applied to writing code
- 2026-05-05-feathers-working-effectively-with-legacy-code - seams as the structural-change vocabulary
- 2026-05-05-tdd-is-dead-debate-dhh-beck-fowler - the debate that Tidy First? quietly resolves by separating concerns
- 2026-05-05-hughes-quickcheck-property-based-testing - properties as a tidying applied to test suites
- ~/.claude/skills/verify-action/SKILL.md - this morning’s small-step ship; the artifact that prompted this note
- 2026-04-15-commoncog-becks-measurement-model - other Beck content already in the vault (Hillel Wayne / Cedric Chin variant)
- ../01-projects/data-quality-framework/testing-matrix-template.md - Scope x Basis matrix as data-engineering tidying
- feedback_no_em_dashes - structural-vs-behavioral disambiguation in writing
- feedback_no_claudemd_state_drift - workspace-state vs stable-rules separation
- feedback_pr_only_workflow - unit-of-ship discipline at the repo level