Lessons from Building Claude Code: Seeing like an Agent — Thariq
Thariq (@trq212), an Anthropic engineer on the Claude Code team, published this piece on February 27, 2026. It’s an insider account of how the Claude Code team designs tools for agents — what worked, what failed, and the evolving principle they landed on: see like an agent.
The article covers three case studies from building Claude Code: the AskUserQuestion tool, the TodoWrite-to-Tasks evolution, and the search interface progression from RAG to progressive disclosure.
The Core Framework: Design Tools Shaped to the Model’s Abilities
The analogy that opens the piece: imagine being given a difficult math problem. What tools would you want? Paper lets you work but limits you to manual calculation. A calculator is better but requires knowing which functions to use. A computer is most powerful but requires knowing how to write and execute code. The right tool depends on your own skills.
The same logic applies to agents: give them tools shaped to their abilities, not your imagination of what they should be able to do. To know what those abilities are, you pay attention, read outputs, experiment. You learn to see like an agent.
Case Study 1: The AskUserQuestion Tool (Elicitation)
Problem: Claude could ask questions in plain text, but answering them took too long and created friction. How do you increase the bandwidth of communication between user and Claude?
Attempt 1 — Modify ExitPlanTool: Add a questions array alongside the plan. Failed because Claude couldn’t reconcile questions and plan simultaneously. Confused the model.
Attempt 2 — Change output format: Ask Claude to output a modified markdown format with questions. Claude would append extra sentences, omit options, or use a different format. Not guaranteed.
Attempt 3 — Dedicated AskUserQuestion tool: Claude calls it at any point (especially in plan mode). It triggers a modal, blocks the agent loop until the user answers, and prompts Claude for structured output with multiple options. This worked — not because it was technically elegant, but because Claude seemed to like calling it.
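A minimal sketch of what a dedicated elicitation tool like this might look like. The field names and schema shape here are illustrative assumptions, not the actual Claude Code implementation; the point is the structure the article describes — a tool call with structured multiple-choice questions whose handler blocks the agent loop until the user answers.

```python
# Illustrative tool definition — names and schema are assumptions,
# not the real AskUserQuestion schema.
ASK_USER_QUESTION = {
    "name": "AskUserQuestion",
    "description": (
        "Ask the user one or more multiple-choice questions. "
        "Blocks the agent loop until the user answers."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "questions": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "question": {"type": "string"},
                        "options": {"type": "array", "items": {"type": "string"}},
                    },
                    "required": ["question", "options"],
                },
            }
        },
        "required": ["questions"],
    },
}

def handle_tool_call(tool_input, ask_user):
    """Render a modal, block until the user answers each question,
    then return the answers to the model as the tool result."""
    answers = []
    for q in tool_input["questions"]:
        choice = ask_user(q["question"], q["options"])  # blocks here
        answers.append({"question": q["question"], "answer": choice})
    return {"answers": answers}
```

Because the tool returns structured options rather than free text, the harness can render a real UI, and the model gets a predictable result format back — the affordance the plain-markdown attempts couldn’t guarantee.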
Key lesson: The best-designed tool doesn’t work if Claude doesn’t understand how to call it. Tool design is partly about creating affordances that fit the model’s natural tendencies, not just the user’s needs.
Unresolved: Is AskUserQuestion the final form of elicitation? Probably not — what works for one model may not work for another as capabilities improve.
Case Study 2: TodoWrite → Task Tool (Updating with Capabilities)
When Claude Code launched, the model needed a TodoWrite tool to track state. Todos were written at the start and checked off as work progressed. System reminders every 5 turns kept Claude on track.
As Opus 4.5 improved, the dynamic flipped: reminders that once kept Claude on track now made it think it had to stick to the list rather than modify it. The model was better than the scaffolding. Opus 4.5 also got significantly better at using subagents — but subagents couldn’t coordinate on a shared TodoWrite list.
The solution: replace TodoWrite with the Task tool. Tasks include dependencies, share updates across subagents, and the model can alter or delete them. The key difference: Todos were about keeping the model on track; Tasks are about helping agents communicate with each other.
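A rough sketch of the difference, under stated assumptions: the class and method names below are hypothetical, but they capture the two properties the article attributes to Tasks — dependencies between items, and a store shared across subagents that any agent may modify or delete from.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    id: str
    description: str
    status: str = "pending"  # pending | in_progress | done
    depends_on: list = field(default_factory=list)

class TaskStore:
    """Shared across subagents: any agent can add, update, or delete tasks
    (unlike a single TodoWrite list owned by the main loop)."""

    def __init__(self):
        self.tasks = {}

    def add(self, task):
        self.tasks[task.id] = task

    def delete(self, task_id):
        self.tasks.pop(task_id, None)

    def ready(self):
        """Tasks whose dependencies are all done — claimable by any subagent."""
        return [
            t for t in self.tasks.values()
            if t.status == "pending"
            and all(
                self.tasks[d].status == "done"
                for d in t.depends_on if d in self.tasks
            )
        ]
```

The `ready()` query is what a flat todo list can’t express: coordination emerges from the dependency graph rather than from reminder prompts.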
Key lesson: As model capabilities increase, tools that once helped may start constraining. Constantly revisit previous assumptions about what tools are needed. The right tool for one capability level is wrong for the next. This is also why it’s useful to support a small set of models with a similar capability profile — tool design is model-specific.
Case Study 3: Search Interface (Progressive Disclosure)
Early Cowork/Claude Code: RAG vector database for context retrieval. Powerful and fast but required indexing and setup, fragile across environments. Most importantly: Claude was given context rather than finding it itself.
The shift: Give Claude a Grep tool and let it search the codebase directly. This worked and revealed a principle: as Claude gets smarter, it becomes increasingly good at building its own context if given the right tools. The agent shouldn’t receive context passively — it should be able to construct it actively.
Progressive Disclosure formalized: When Agent Skills were introduced, this became an explicit pattern. Claude reads skill files; those files reference other files Claude can read recursively. Skills add more search capabilities — instructions for how to use an API, how to query a database. Claude went from barely building its own context to doing nested search across several layers to find exactly what it needs.
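The recursive-reading pattern can be sketched in a few lines. The file contents and the `See: <path>` reference convention below are invented for illustration (real skills live on disk and reference files however their author chooses); what matters is that each file is loaded only when something already in context points to it.

```python
import re

# Hypothetical skill files — in practice these are files on disk.
FILES = {
    "skills/api-skill/SKILL.md": "How to call the API.\nSee: skills/api-skill/auth.md",
    "skills/api-skill/auth.md": "Auth uses a token.\nSee: skills/api-skill/token.md",
    "skills/api-skill/token.md": "Tokens come from the /token endpoint.",
}

def read_progressively(path, seen=None):
    """Follow 'See: <path>' references recursively — the agent loads only
    what the current file points to, never everything upfront."""
    seen = seen if seen is not None else set()
    if path in seen or path not in FILES:
        return []
    seen.add(path)
    contents = [FILES[path]]
    for ref in re.findall(r"See: (\S+)", FILES[path]):
        contents.extend(read_progressively(ref, seen))
    return contents
```

Only the entry-point file needs to be in context at the start; each layer of detail is pulled in on demand, which is the inversion from RAG: the agent constructs its context rather than receiving it.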
The Claude Code Guide subagent: Claude didn’t know enough about itself (what an MCP is, what a slash command does). Rather than stuffing the system prompt with docs (causing context rot), they built a Guide subagent. Claude is prompted to call it when asked about itself. The subagent has extensive instructions on searching docs and what to return. This is progressive disclosure applied to self-knowledge: “add things to Claude’s action space without adding a tool.”
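The same idea in sketch form — a config fragment, with all names and wording invented for illustration: instead of docs in the system prompt, a one-line routing instruction plus a subagent that does the searching.

```python
# Hypothetical config — snippet text and subagent name are assumptions.
SYSTEM_PROMPT_SNIPPET = (
    "When the user asks about Claude Code itself (MCP servers, slash "
    "commands, settings), call the claude-code-guide subagent instead "
    "of answering from memory."
)

GUIDE_SUBAGENT = {
    "name": "claude-code-guide",
    "prompt": (
        "You answer questions about Claude Code. Search the docs with "
        "Grep, read only the files that match, and return a concise "
        "answer with file references."
    ),
}
```

The main agent’s context pays only for the routing sentence; the doc-heavy instructions live in the subagent, which is spun up on demand and discarded.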
An Art, Not a Science
No rigid rules. Tool design depends heavily on the model being used, the goal of the agent, and the environment it operates in. Experiment often, read your outputs, try new things. See like an agent.
The bar to add a new tool is high — every tool is one more option for the model to reason about. Claude Code has ~20 tools and constantly asks whether all of them are necessary.
Connections
- 06-reference/2026-04-06-claude-agent-sdk-guide — the Agent SDK exposes the same engine that powers Claude Code; the tool design principles here apply directly to SDK-built agents.
- 06-reference/2026-04-07-claude-code-architecture-teardown — the four-layer architecture (model, context, harness, infrastructure) is the structural frame this article fills in with concrete examples.
- 06-reference/2026-03-15-claude-architect-course — the Claude Architect certification curriculum; tool design is a core component.
- 06-reference/2026-04-04-claude-code-best-practices — practitioner’s guide to Claude Code that complements this insider account.
- 06-reference/2026-01-19-arscontexta-vault-agent-series — vault agent architecture; progressive disclosure principles apply to how skills and context are structured in this vault.
- 06-reference/2026-04-04-talking-to-agents-is-all-you-need — “context is the new moat” connects to the shift from passive RAG to active context-building described here.
Actionable for RDCO
The progressive disclosure pattern is directly applicable to how our skills in ~/.claude/skills/ are structured. Each skill should reference other files it might need rather than loading everything upfront — exactly the pattern Thariq describes.
The TodoWrite → Task evolution is a signal to watch: as Claude’s capabilities improve (Opus 4.6 → future models), tools that currently feel essential may start to constrain. The agent architecture should be reviewed each time a major model upgrade ships.
The “see like an agent” principle is the right frame for designing prompts, context files, and skill files: build scaffolding shaped to what the model can actually do, not what you wish it could do. The AskUserQuestion story is a useful calibration — sometimes the model will naturally gravitate toward a good behavior if you give it the right affordance.