“The AI loop that’s been rewiring how I think about company design” — Mitohealth founder
Why this is in the vault
Direct external articulation of the agent-native company architecture RDCO has been operating for itself. 4th piece in this week's thesis-cluster (Turing / Elad Gil / Reiner Pope / Meta Ads CLI). Maps 1-to-1 onto RDCO's operating loop and provides clean vocabulary ("queryable company," "5 layers," "headcount as a feature") worth adopting in Sanity Check writing. Founder shared via iMessage 2026-04-30 with the question "We've been circling this right?" — answer is yes, and more than circling: we're running it.
The 5 layers (author’s framing)
- Sensors + data — every signal from the outside world. Customer emails, support tickets, cancellations, product events, code changes. “If it’s not captured, it didn’t happen to the company.”
- Policy layer — the rules. What the system can do alone, what needs human sign-off, what must be logged. Guardrails that make the loop trustworthy.
- Tool layer — the deterministic stuff. SQL, API calls, calendar lookups. Things that live in code, not English. Cites @garrytan: "figuring out what belongs in markdown vs what belongs in code is 90% of the battle."
- Quality gates — safety checks, human review for high-stakes calls. “The escape hatch back into judgment.”
- Learning mechanism — the unlock. Monitoring agent watches every query, sees where it fails, writes the fix overnight, opens the merge request, ships it. “The company gets better while you sleep.”
Author’s claim: “Most teams have 1 through 4. Almost nobody is running 5 across every function yet. That’s the next 6 months.”
Author’s operating context: 5 people at @usemitohealth across two cities. Everyone touches code. Revenue per employee “at a level I wouldn’t have believed in my fintech days.” Headcount as a feature, not a bug.
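The loop compresses into very little code. A minimal runnable sketch of the five layers as the author frames them; every name here is hypothetical, illustrating the framing rather than Mitohealth's actual system:

```python
# Minimal sketch of the five-layer loop. All names are hypothetical,
# illustrating the author's framing, not Mitohealth's implementation.
from dataclasses import dataclass, field

@dataclass
class Signal:
    source: str    # e.g. "support-ticket", "product-event"
    payload: str

@dataclass
class Company:
    vault: list[Signal] = field(default_factory=list)     # Layer 1: sensors + data
    policy: dict[str, str] = field(default_factory=dict)  # Layer 2: action -> "auto" | "human"
    failures: list[str] = field(default_factory=list)     # feeds Layer 5

    def capture(self, signal: Signal) -> None:
        # Layer 1: if it's not captured, it didn't happen to the company.
        self.vault.append(signal)

    def act(self, action: str) -> str:
        # Layer 2: policy decides what runs alone vs what needs sign-off.
        if self.policy.get(action, "human") != "auto":
            return f"queued for human review: {action}"
        result = run_tool(action)         # Layer 3: deterministic tools
        if not quality_gate(result):      # Layer 4: escape hatch back into judgment
            self.failures.append(action)
            return f"blocked by quality gate: {action}"
        return result

    def overnight_learning(self) -> list[str]:
        # Layer 5: monitoring agent reviews failures and proposes fixes.
        return [f"proposed fix for: {a}" for a in self.failures]

def run_tool(action: str) -> str:
    return f"ran {action}"    # stand-in for SQL / API calls / calendar lookups

def quality_gate(result: str) -> bool:
    return "error" not in result
```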
Mapping against Ray Data Co — direct 1-to-1
This is the strongest external-articulation match for the RDCO operating loop we’ve seen.
Layer 1: Sensors + data → the vault
- Gmail MCP captures inbound mail (newsletter ingestion, conversations, transactional)
- Google Calendar MCP captures the schedule
- Notion captures the task board + Research Backlog + Contact Candidates DB
- iMessage + Discord MCPs capture two-way comms with founder
- /process-newsletter watch + /process-youtube watch + crons capture external signal (newsletters, podcasts, videos, articles)
- Founder’s journal entries land in the vault directly
- Web fetches (WebFetch / xmcp / qmd) capture ad-hoc research
If something happens to RDCO and isn’t in the vault, it didn’t happen. Layer 1 = solved.
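The whole layer reduces to one discipline: every source, whatever its shape, lands as a vault record. A hedged sketch of that normalization step; the 00-inbox path and frontmatter fields are illustrative stand-ins, only the ~/rdco-vault root is real:

```python
# Hypothetical sketch of Layer 1 normalization: every source funnels into
# one vault record shape, so "not in the vault" cannot happen silently.
# The 00-inbox subfolder and frontmatter fields are illustrative.
import datetime
import pathlib

def capture_to_vault(source: str, title: str, body: str,
                     vault: pathlib.Path = pathlib.Path("~/rdco-vault").expanduser()) -> pathlib.Path:
    stamp = datetime.date.today().isoformat()
    slug = "-".join(title.lower().split())[:60]
    note = vault / "00-inbox" / f"{stamp}-{slug}.md"
    note.parent.mkdir(parents=True, exist_ok=True)
    note.write_text(f"---\nsource: {source}\ncaptured: {stamp}\n---\n\n{body}\n")
    return note
```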
Layer 2: Policy → CLAUDE.md hard rules + skill guardrails
- CLAUDE.md hard rule #1: always run `date` before stating time
- CLAUDE.md hard rule #2: channel responses go through the channel's reply tool
- CLAUDE.md hard rule #3: trust the UTC offset on Calendar dateTimes
- CLAUDE.md hard rule #4: route long artifacts through subagents (context-rot prevention)
- Memory: “Distinguish decision from action” — what Ray executes autonomously vs what surfaces for founder judgment
- Memory: “Brief iMessage, link to HQ” — channel-level brevity policy
- Memory: “No secrets on disk” — credentials live in 1Password, not env files
- The Stripe RAK scoping conversation on the morning of 2026-04-30 was a textbook Layer 2 decision: which API surfaces get Read, which get Write, which stay None. Explicit policy authoring, sketched below.
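What that conversation produced, in sketch form: access levels authored as data, checked deterministically. The surface names and levels below are illustrative, not the actual RAK config:

```python
# Hedged sketch of the Layer 2 pattern behind the Stripe RAK scoping call:
# explicit Read / Write / None per API surface, authored as data, enforced
# in code. Surfaces and levels here are illustrative placeholders.
from enum import Enum

class Access(Enum):
    NONE = 0
    READ = 1
    WRITE = 2

POLICY = {
    "charges": Access.READ,        # agent may inspect, never create
    "customers": Access.READ,
    "payment_links": Access.WRITE,
    "payouts": Access.NONE,        # stays invisible to the agent
}

def allowed(surface: str, wants_write: bool) -> bool:
    level = POLICY.get(surface, Access.NONE)   # default-deny for unlisted surfaces
    return level == Access.WRITE or (level == Access.READ and not wants_write)

assert allowed("charges", wants_write=False)
assert not allowed("charges", wants_write=True)
assert not allowed("payouts", wants_write=False)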
Layer 3: Tool → skill stack + cron + audit scripts
- ~/.claude/skills/ — 30+ skills (process-newsletter, check-board, morning-prep, deep-research, /improve, vault-health, sync-contacts, etc.)
- ~/.claude/scripts/ — deterministic Python tooling (audit-newsletter-outputs.py, graph-ingest.py, vtt-to-text.py, extract-key-frames.py, eval-mine.py, rdco-doctor.py)
- ~/.claude/scripts/scheduled-jobs.txt — 13 cron schedules
- 25+ MCP servers wired up (Gmail, Calendar, Notion, Discord, iMessage, Stripe, Cloudflare, Slack, Canva, Figma, qmd, blender, xmcp, ElevenLabs, Monarch, Firebase, Vercel, Playwright, Xcode, computer-use, claude-in-chrome, etc.)
The Garry Tan markdown-vs-code attribution maps EXACTLY onto our skill-vs-script split:
- Skills (markdown) = workflow specs that Claude reads and executes (per-newsletter classification, sponsor detection, content triage logic)
- Scripts (code) = deterministic checks that don’t need LLM (audit invariants, frame extraction, graph ingest, vtt conversion)
That split is the entire thesis behind the audit script as “RDCO’s deterministic verification layer, the answer to Kingsbury’s verification-layer LLM contamination critique.”
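To make the split concrete, a hedged sketch of a scripts-side check in the spirit of audit-newsletter-outputs.py: zero LLM calls, pure post-conditions on a filed note. The three invariants shown are stand-ins; the real script checks 13:

```python
# Hedged sketch of the code side of the markdown-vs-code split: a zero-LLM
# post-condition check over a vault note. Invariants are illustrative
# stand-ins, not the actual 13 from audit-newsletter-outputs.py.
import pathlib
import re

def audit_note(path: pathlib.Path) -> list[str]:
    text = path.read_text()
    violations = []
    if not text.startswith("---"):
        violations.append("missing frontmatter block")
    if not re.search(r"^source:", text, re.MULTILINE):
        violations.append("missing source: field")
    if re.search(r"\bTODO\b", text):
        violations.append("unresolved TODO left in filed note")
    return violations   # empty list == invariants hold, no judgment involved
```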
Layer 4: Quality gates → audit scripts + per-charge approval + escape hatches
- audit-newsletter-outputs.py: zero LLM calls, deterministic post-condition check on 13 invariants per newsletter file
- Link CLI per-charge approval: founder taps approve/deny on every spend, no autonomous Ray purchases
- /check-board: never auto-archives Founder-owned tasks, always surfaces “Both”-owned items where founder-side is the blocker
- “Distinguish decision from action” memory: explicit gate between reversible Ray-executes and irreversible founder-decides
- /improve autonomous: low-risk fixes apply silently, structural changes queue to Notion for founder approval (built-in escape hatch back into judgment)
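The common shape of these gates is a blocking human verdict in front of an irreversible action. A minimal sketch of the per-charge contract; the console prompt is a stand-in for Link's actual tap-to-approve flow:

```python
# Hedged sketch of the Layer 4 per-charge contract: no spend executes
# without an explicit human verdict. The input() prompt stands in for
# Link's real approve/deny tap; only the contract shape is the point.
def request_spend(amount_usd: float, memo: str) -> bool:
    verdict = input(f"Approve ${amount_usd:.2f} for {memo}? [y/N] ")
    return verdict.strip().lower() == "y"

def buy(amount_usd: float, memo: str) -> str:
    if not request_spend(amount_usd, memo):
        return "denied: no autonomous purchases"
    return f"charged ${amount_usd:.2f} for {memo}"
```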
Layer 5: Learning mechanism → /improve autonomous + /self-review + eval-mine + rdco-doctor
This is the layer the author claims “almost nobody is running yet.” We are.
- /self-review --fix runs weekly, appends systemic patterns to ~/rdco-vault/01-projects/self-review/review-log.md
- /improve autonomous runs the next morning, reads the review log, applies low-risk fixes silently to skill files, queues structural changes to the Notion board for founder approval
- rdco-doctor.py audits skill quality (description overlap, dark skills, missing scripts) every cycle
- eval-mine.py mines failure patterns from skill outputs and ranks skills by their "fucking-shit / wtf" candidate test cases, feeding the next /improve cycle
- newsletter audit script catches structural drift in vault filing standards every batch
- The 2026-04-20 changelog entries on /process-newsletter and the 2026-04-24 I12 audit softening are concrete proofs of the loop working
The skill files literally improve themselves overnight based on prior performance, with founder approval gates on structural changes. That is Layer 5 running across the entire operating surface.
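A hedged sketch of that overnight cycle: read the review log, apply what's low-risk, queue what's structural. The risk heuristic and queue target below are stand-ins; the real routing logic lives in the /improve skill file:

```python
# Hedged sketch of the Layer 5 cycle: weekly review log in, low-risk fixes
# applied silently, structural changes queued for founder approval.
# The "structural" keyword heuristic is an illustrative stand-in.
import pathlib

REVIEW_LOG = pathlib.Path(
    "~/rdco-vault/01-projects/self-review/review-log.md").expanduser()

def run_improve_cycle() -> dict[str, list[str]]:
    applied, queued = [], []
    if not REVIEW_LOG.exists():
        return {"applied_silently": applied, "queued_for_founder": queued}
    for line in REVIEW_LOG.read_text().splitlines():
        if not line.startswith("- "):
            continue
        finding = line[2:]
        if "structural" in finding.lower():   # stand-in risk heuristic
            queued.append(finding)            # real loop queues to Notion board
        else:
            applied.append(finding)           # real loop edits skill files
    return {"applied_silently": applied, "queued_for_founder": queued}
```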
Where the author’s framing differs from RDCO’s
- Scale: he’s running 5 people across two cities; RDCO is 1 founder + 1 agent. RDCO is the smallest possible expression of the architecture, which makes it the strongest demo (if it works at 1, the architecture is real; nothing about the framing requires team-scale).
- Domain: Mitohealth is biotech/consumer-health (per “fintech days” reference, founder is post-fintech). RDCO is data engineering + content + agent-deployer advisory. The architecture is domain-agnostic — both work.
- Visibility: Mitohealth founder is publishing this on X as a recruiting/positioning post. RDCO has been BUILDING it but not narrating it publicly with this clarity. Sanity Check editorial gap — fill it.
Strategic upshot for RDCO
RDCO is the operating proof of the agent-native company thesis. Not theory. Not consulting talking points. Actual running production loop with 13 crons, 30+ skills, 1300+ vault docs, deterministic audit layer, weekly self-improvement cycle, and per-charge spend governance.
That’s a structurally stronger Sanity Check + advisory hand than anyone selling the framing without running it. The “operator-as-evidence” angle the founder has been building maps directly onto this — RDCO’s daily operating rhythm IS the proof.
The Sanity Check editorial move: write the canonical mid-market version of “what does the agent-native company architecture look like at scale 1?” — using RDCO’s own loop as the worked example. Each layer gets a section with the actual scripts, skills, and decisions named explicitly. Non-derivative re-frame because (a) author writes from biotech/5-person scale, RDCO writes from solo+1-agent scale, (b) the operating proof is concrete and inspectable.
Vocabulary worth adopting
- “Queryable company” — name the architecture
- “Headcount as a feature, not a bug” — frame the lean-stack pitch
- “The middle is what’s compressing” — frame the workforce narrative
- “Markdown vs code is 90% of the battle” (Garry Tan) — frame the skill-vs-script split publicly
Open questions worth tracking
- Specific Mitohealth founder name and X handle (the screenshot mentioned @usemitohealth but not the personal handle) — worth surfacing for tracked-author candidate list
- What specific monitoring agent did Mitohealth build for Layer 5? Is it published or proprietary? Could it be a reference architecture for what /improve autonomous becomes at scale?
- The “@garrytan markdown vs code” reference — was this from the YC Spring 2026 batch talk? Is it published anywhere we should ingest?
Related
- 2026-04-30-jonathan-siddharth-turing-superintelligence-loop — same week, agent-deployer thesis at $1B+ scale; Mitohealth founder is the operator-side proof of what Turing claims to platform
- 2026-04-30-meta-ads-cli-agent-native-launch — Layer 3 platform-vendor surface for this architecture
- 2026-04-29-tim-ferriss-elad-gil-ai-frontier-billion-dollar-companies — four-criteria durability test maps onto Layer 5 (learning mechanism = workflow embed + proprietary data, the two strongest criteria)
- 2026-04-29-dwarkesh-reiner-pope-gpt5-claude-gemini-training — memory-bandwidth wall implication for which Layer 3 architectures are durable
- 2026-04-29-link-cli-agent-wallet-setup — Layer 4 quality gate primitive (per-charge approval contract)
- 2026-04-11-garry-tan-thin-harness-fat-skills — the original markdown-vs-code framing source RDCO already filed; this Mitohealth post brings it forward as load-bearing
- Notion Research Backlog: Sanity Check candidate “RDCO as the operating proof of the agent-native company thesis at scale 1” (queue from this note)