I Want to Become a Claude Architect (Full Course)
@hooeem on X reverse-engineered the Claude Certified Architect exam guide (a partner-only certification) and published what it tests. The framing: you don’t need the cert to build production-grade Claude applications — you need the knowledge. This note distills that knowledge.
The Exam at a Glance
Five domains, each weighted by exam share:
| Domain | Weight |
|---|---|
| Agentic Architecture & Orchestration | 27% |
| Claude Code Configuration & Workflows | 20% |
| Prompt Engineering & Structured Output | 20% |
| Tool Design & MCP Integration | 18% |
| Context Management & Reliability | 15% |
The exam is organized around six practical scenarios: a customer support resolution agent, code generation with Claude Code, a multi-agent research system, developer productivity tooling, Claude Code in CI/CD, and structured data extraction.
Domain 1: Agentic Architecture & Orchestration (27%)
The heaviest domain. The exam tests whether you understand how multi-agent systems actually work at the plumbing level, not just conceptually.
Anti-patterns the exam expects you to reject:
- Parsing natural language output to detect loop termination
- Using arbitrary iteration counts as the primary stopping mechanism
- Inspecting assistant text to infer completion status
The most common misunderstanding tested: assuming subagents share memory with the coordinator. They do not. Subagents have isolated context: anything a subagent needs from the coordinator must be explicitly included in that subagent's prompt.
The safety rule: for financial or security-critical tool ordering, prompt instructions are insufficient. Enforcement must happen programmatically via hooks and prerequisite gates.
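A minimal sketch of such a gate as a Claude Code PreToolUse hook script. It assumes Claude Code's hook contract (tool call details arrive as JSON on stdin; exit code 2 blocks the call and feeds stderr back to the model); the tool names and the marker-file state check are hypothetical.

```python
#!/usr/bin/env python3
"""PreToolUse gate: enforce tool ordering programmatically, not via prompt text."""
import json
import os
import sys

event = json.load(sys.stdin)  # Claude Code passes tool_name / tool_input here

# Hypothetical ordering rule: issue_refund may only run after
# verify_identity has succeeded (recorded here as a marker file).
if event.get("tool_name") == "issue_refund":
    if not os.path.exists("/tmp/identity_verified"):
        print("Blocked: run verify_identity before issue_refund.", file=sys.stderr)
        sys.exit(2)  # exit code 2 blocks the tool call

sys.exit(0)  # allow everything else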
Key building blocks to understand: `stop_reason` handling, `PostToolUse` hooks for normalizing data, and tool call interception for policy enforcement.
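For `stop_reason` handling specifically, the shape of a loop that terminates on the API signal rather than on text parsing might look like this (a sketch assuming the anthropic Python SDK; the tool definition and its executor are hypothetical stubs):

```python
import anthropic

client = anthropic.Anthropic()

# Hypothetical tool definition and stub executor, just to make the loop concrete.
TOOLS = [{
    "name": "lookup_ticket",
    "description": "Fetch a support ticket by ID.",
    "input_schema": {"type": "object",
                     "properties": {"ticket_id": {"type": "string"}},
                     "required": ["ticket_id"]},
}]

def execute_tool(name: str, args: dict) -> str:
    return f"(stub result for {name} with {args})"

messages = [{"role": "user", "content": "Resolve ticket T-4521."}]
while True:
    response = client.messages.create(
        model="claude-sonnet-4-5", max_tokens=1024,
        tools=TOOLS, messages=messages,
    )
    if response.stop_reason != "tool_use":
        break  # terminate on the API signal, never by inspecting assistant text
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": block.id,
         "content": execute_tool(block.name, block.input)}
        for block in response.content if block.type == "tool_use"
    ]})
```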
Domain 2: Tool Design & MCP Integration (18%)
The exam focuses on description quality, tool_choice semantics, and scope management.
Tool descriptions are the primary selection mechanism. Vague or overlapping descriptions cause misrouting. The fix is better descriptions — not few-shot examples, not routing classifiers, not consolidation.
tool_choice options you must know cold (sketch below):
- `auto` — model may return text without calling a tool
- `any` — model must call a tool, and picks which one
- Forced selection — model must call a specific named tool
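In SDK terms, the three modes differ only in the `tool_choice` parameter (a sketch assuming the anthropic Python SDK; the `get_invoice` tool is hypothetical):

```python
import anthropic

client = anthropic.Anthropic()

tools = [{  # hypothetical tool
    "name": "get_invoice",
    "description": "Fetch an invoice by ID.",
    "input_schema": {"type": "object",
                     "properties": {"invoice_id": {"type": "string"}},
                     "required": ["invoice_id"]},
}]
kwargs = dict(model="claude-sonnet-4-5", max_tokens=512, tools=tools,
              messages=[{"role": "user", "content": "What's on invoice INV-901?"}])

client.messages.create(tool_choice={"type": "auto"}, **kwargs)  # may answer in text
client.messages.create(tool_choice={"type": "any"}, **kwargs)   # must call some tool
client.messages.create(tool_choice={"type": "tool", "name": "get_invoice"},
                       **kwargs)  # must call get_invoice specifically
```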
Scope discipline: giving an agent 18 tools degrades selection reliability. Each subagent should be scoped to 4–5 tools directly relevant to its role.
MCP specifics: project-level vs. user-level config (project config is version-controlled and shared; user-level is not), environment variable expansion in `.mcp.json`, and when to use community servers vs. custom builds.
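For example, a project-level `.mcp.json` can reference secrets through environment variable expansion instead of committing them (the server and variable names here are illustrative):

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    }
  }
}
```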
Domain 3: Claude Code Configuration & Workflows (20%)
This is where practitioners diverge from power users.
CLAUDE.md hierarchy (three levels):
- User-level (`~/.claude/CLAUDE.md`) — personal, not version-controlled
- Project-level (`.claude/CLAUDE.md` or `CLAUDE.md` in repo root) — shared with team
- Directory-level (subdirectory `CLAUDE.md`) — scoped to that subtree
The exam’s favourite trap: a team member missing instructions because they live in user-level config, which isn’t checked into the repo and doesn’t propagate to other developers.
Path-specific rules are underused: files in `.claude/rules/` with YAML frontmatter glob patterns (e.g., `**/*.test.tsx`) apply conventions across the entire codebase. Directory-level `CLAUDE.md` cannot do this because it is directory-bound, not glob-bound.
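A sketch of what such a rule file might look like. The rule content is hypothetical, and the frontmatter key is shown as `globs` as an assumption; check the Claude Code docs for the exact field name your version expects.

```markdown
---
globs: "**/*.test.tsx"
---

- Query by role with React Testing Library; avoid test IDs.
- Co-locate each component's test file next to the component.
```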
Plan mode vs. direct execution:
- Plan mode: multi-file migrations, architectural decisions, monolith restructuring
- Direct execution: single-file bug fixes, clear-scope tasks
Key flags and concepts: `context: fork` in skill frontmatter (isolates verbose output), the `-p` flag for non-interactive CI/CD pipelines, and the value of running an independent review instance (it catches more than self-review in the same session).
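For `context: fork`, the frontmatter placement might look like the sketch below; the skill name, description, and body are hypothetical, and `context: fork` is the field the guide calls out.

```markdown
---
name: run-integration-tests
description: Run the integration suite and summarize failures.
context: fork
---

Run the project's integration tests and report only the failing cases,
so the verbose runner output stays in the forked context.
```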
Domain 4: Prompt Engineering & Structured Output (20%)
The two-word rule for this domain: be explicit. Vague constraints (“be conservative,” “only report high-confidence findings”) do not work. Concrete examples of what to include vs. skip, with severity-level definitions, do work.
Few-shot examples are the highest-leverage technique tested. 2–4 examples showing ambiguous-case handling, with reasoning for why one action was chosen over alternatives, outperform any amount of instruction prose.
`tool_use` with JSON schemas eliminates syntax errors but not semantic errors. Schema design details that matter: nullable fields when source data may be absent (prevents fabricated values), "unclear" enum values for ambiguous cases, and "other" + detail strings for catch-all categories.
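Put together, a schema applying all three patterns might look like this sketch (field names are hypothetical; the dict would be passed as a tool's `input_schema`):

```python
invoice_schema = {
    "type": "object",
    "properties": {
        # Nullable: the source document may simply lack a due date;
        # allowing null keeps the model from fabricating one.
        "due_date": {"type": ["string", "null"], "format": "date"},
        # An "unclear" member gives ambiguous cases a legitimate landing spot.
        "payment_status": {"type": "string",
                           "enum": ["paid", "unpaid", "partial", "unclear"]},
        # Catch-all category paired with a free-text detail string.
        "category": {"type": "string",
                     "enum": ["utilities", "software", "travel", "other"]},
        "category_detail": {"type": "string",
                            "description": "Required when category is 'other'."},
    },
    "required": ["due_date", "payment_status", "category"],
}
```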
Message Batches API tradeoffs:
- 50% cost savings, up to 24-hour processing time, no latency SLA, no multi-turn tool calling
- Use for overnight batch jobs; use the synchronous API for blocking pre-merge checks (sketch below)
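Submitting such an overnight job is fire-and-forget (a sketch assuming the anthropic Python SDK's Message Batches interface; the document list is hypothetical):

```python
import anthropic

client = anthropic.Anthropic()
documents = ["invoice text 1...", "invoice text 2..."]  # hypothetical corpus

batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": f"doc-{i}",
            "params": {
                "model": "claude-sonnet-4-5",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": doc}],
            },
        }
        for i, doc in enumerate(documents)
    ]
)
print(batch.id, batch.processing_status)  # poll for results later; don't block
```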
Domain 5: Context Management & Reliability (15%)
Smallest weight, but failures here cascade across every domain.
Progressive summarization kills transactional data. The fix: a persistent “case facts” block with extracted amounts, dates, and IDs that is never summarized and is included in every prompt.
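A sketch of the pattern (all values hypothetical): the facts block is stored verbatim and prepended to every prompt, while summarization applies only to the conversation history.

```python
case_facts = {  # extracted once, never summarized
    "ticket_id": "T-4521",
    "refund_amount": "$249.00",
    "order_date": "2026-03-14",
    "customer_id": "C-88210",
}
summarized_history = "Customer reported a duplicate charge; agent confirmed it."
latest_message = "Can you process the refund now?"

facts_block = "\n".join(f"- {k}: {v}" for k, v in case_facts.items())
prompt = (
    f"CASE FACTS (verbatim, do not summarize):\n{facts_block}\n\n"
    f"CONVERSATION SUMMARY:\n{summarized_history}\n\n"
    f"CURRENT REQUEST:\n{latest_message}"
)
```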
“Lost in the middle” effect: models miss findings buried in long inputs. Place key summaries at the beginning of the context, not the middle.
Valid escalation triggers:
- Customer explicitly requests a human — honor immediately
- Policy gap the agent cannot resolve
- Agent cannot make progress
Invalid escalation triggers the exam will tempt you with: sentiment analysis and self-reported confidence scores. Both are unreliable.
Error propagation that works: structured context including failure type, attempted query, partial results, and alternatives tried. Anti-patterns: silently suppressing errors or aborting entire workflows on a single failure.
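A sketch of what that structured context might contain (field names and values are hypothetical):

```python
# Returned to the coordinator instead of raising or returning nothing,
# so one failed lookup doesn't abort the whole workflow.
error_context = {
    "failure_type": "search_timeout",
    "attempted_query": "Q4 churn by region",
    "partial_results": ["EMEA: 4.1%", "APAC: 3.7%"],  # keep what succeeded
    "alternatives_tried": ["narrowed date range", "cached index fallback"],
}
```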
RDCO Mapping
| Domain | RDCO Status |
|---|---|
| Agentic Architecture | Actively running — channels agent on Mac Mini, LaunchAgent + tmux, daily restart. Coordinator/subagent pattern used informally. |
| MCP Integration | Multiple MCP servers running (Discord, iMessage, qmd, xmcp, Canva, etc.). Project-level .mcp.json configured. |
| Claude Code Config | SOUL.md + CLAUDE.md hierarchy in place. Skills in ~/.claude/skills/. -p flag used for CI contexts. |
| Prompt Engineering | Applied in skills and vault processing, but less systematically tested for few-shot patterns. |
| Context Management | Awareness is there (compaction at 60%, context: fork), but no formal persistent “case facts” pattern yet. |
Biggest gaps relative to the exam: formal `stop_reason` handling in multi-agent loops, programmatic hook-based tool ordering for financial/security paths, and systematic few-shot examples in structured output workflows.
Recommended Learning Resources (from the article)
- Agent SDK Overview — agentic loop mechanics, subagent patterns
- Building Agents with the Claude Agent SDK — Anthropic’s own best practices on hooks, orchestration, sessions
- Agent SDK Python repo + examples — `hooks`, `custom_tools`, `fork_session`
- MCP Integration for Claude Code — server scoping, env var expansion, project vs. user config
- Anthropic Prompt Engineering docs — few-shot patterns, structured output
- Anthropic API Tool Use documentation — `tool_use`, `tool_choice`, JSON schema enforcement
- Anthropic Academy courses: Building with the Claude API, Introduction to MCP, Claude Code in Action, Claude 101
Connected Vault Docs
- 06-reference/2026-04-06-claude-agent-sdk-guide — Nader Dabit’s SDK walkthrough
- 06-reference/2026-04-07-claude-code-architecture-teardown — Rohit’s reverse-engineering of Claude Code internals
- 06-reference/2026-04-04-claude-code-best-practices — Sankalp’s practitioner guide
- 06-reference/2026-04-04-anthropic-skills-internally — How Anthropic structures skills internally
- 01-projects/phdata/anthropic-certification-study-guide — RDCO study plan built from these domains