IndyDevDan — BIG 3 SUPER AGENT: Gemini 2.5 Computer Use, OpenAI Realtime API, Claude Code

Why this is in the vault

A 32-minute live demo of Dan running a three-vendor agent stack to ship a Sora-2 video generator end to end while he barely types: OpenAI Realtime (voice orchestrator named “Ada”), Claude Code (two builder agents named “Sony” and “Blink”), and Gemini 2.5 Computer Use (browser validation). Most of his videos describe patterns; this one shows the thing actually running, including the failure modes (frontend collisions between Sony and Blink, an in-flight crash that kills the orchestrator, the recovery flow). It belongs in the vault because it’s the most operationally honest “compose-don’t-pick” demo Dan has filed: he names the loyalty pitch (“the tech industry wants you to pick one model, one provider, one tool; for the engineer that’s a losing game”) and then shows concretely what the alternative costs and earns. The back half also pitches his Tactical Agentic Coding course (flagged in the Sponsorship section), which matters because the demo is the curriculum’s proof of concept, not a separable insert.

Core argument

“Think in ands, not ors” is a capital-allocation claim, not just a technical one. Single-vendor loyalty optimizes for the vendor’s business model, not the engineer’s maximum capability. The pattern Dan demonstrates: voice orchestrator (OpenAI) commands code agents (Anthropic) which validate via browser agents (Google). Each vendor wins exactly the layer where it’s currently SOTA, and Dan owns the seam.
The orchestrator layer must be deliberately thin. Ada exposes only CRUD-against-agents (list, create, command, status). It has no opinion about which agents exist, which models they wrap, or what tools they have. This thinness is what lets the system absorb “whatever new Opus or Gemini 3 ships next” without rewriting orchestration logic. Anti-pattern: building orchestrators that encode model choice or tool catalogs.
Multi-agent observability is not a nice-to-have — it’s the gating condition for scaling compute. “If you can’t see what your agents are doing at scale, you can’t scale your compute. And when you scale compute, you scale impact.” Dan’s UI shows live tool calls, hook firings, and cheap-model summaries of every command. The summaries are the key UX move: dive in only when something looks wrong.
Closed-loop validation belongs to the agent, not the engineer. Blink doesn’t return “done” — it spawns a Gemini 2.5 Computer Use agent that opens the browser, takes screenshots, exercises the UI, and reports back. The engineer reviews the result message, not the work. This is the “increase trust, reduce review” thread from his AGENT THREADS video made concrete.
Boundary-setting between agents is a prompt-engineering failure mode. The demo’s biggest visible failure: Sony (backend) starts editing frontend files because Dan didn’t explicitly partition responsibility in the plan. He has to re-prompt. The lesson: when agents share a codebase, ambiguity in the spec becomes silent merge conflict in production. Specs need explicit “you do not touch X” statements.
Recoverability is a system property, not an agent property. When the entire voice orchestration crashes mid-generation, Dan just relists agents and resumes — because the agents have been logging to disk throughout. State lives in files, not in agent memory. This is the missing principle in most multi-agent demos.
Voice is an input mode, not the system. Dan explicitly demos a text-prompt fallback: same capability, no voice. The lesson is that the system was designed input-agnostic — voice / text / programmatic prompt all hit the same orchestrator surface. Anti-pattern: building voice-first agents where the voice IS the system.
“You, the engineer, are the bottleneck — not the tools.” Dan’s recurring frame: models keep getting better, tools keep getting better, but the engineer’s ability to orchestrate compute is the only variable that matters. Direct causal claim: engineering output ∝ compute deployed.
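
The thin-orchestrator and disk-resident-state points above can be sketched together. This is a minimal illustration, not Dan's implementation: the class name `Orchestrator`, the `Runner` callable, the `kind` field, and the one-JSON-file-per-agent layout are all assumptions made for the example. The orchestrator exposes only create, list, command, and status, treats the agent type as an opaque string, and flushes to disk before and after every command so that a fresh process can pick up where a crashed one stopped.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical: a runner receives (agent_record, prompt) and returns a result string.
# In a real stack this is where a vendor-specific agent would be invoked.
Runner = Callable[[dict, str], str]


class Orchestrator:
    """Thin CRUD-against-agents surface; all state lives in one JSON file per agent."""

    def __init__(self, state_dir: Path, runners: dict[str, Runner]) -> None:
        self.state_dir = state_dir
        self.state_dir.mkdir(parents=True, exist_ok=True)
        self.runners = runners  # keyed by an opaque "kind"; no model logic here

    def _path(self, name: str) -> Path:
        return self.state_dir / f"{name}.json"

    def _save(self, record: dict) -> None:
        self._path(record["name"]).write_text(json.dumps(record))

    def create(self, name: str, kind: str) -> dict:
        record = {"name": name, "kind": kind, "status": "idle", "log": []}
        self._save(record)
        return record

    def list_agents(self) -> list[str]:
        return sorted(p.stem for p in self.state_dir.glob("*.json"))

    def status(self, name: str) -> dict:
        return json.loads(self._path(name).read_text())

    def command(self, name: str, prompt: str) -> str:
        record = self.status(name)
        record["status"] = "running"
        record["log"].append({"prompt": prompt})
        self._save(record)  # flush first: a crash leaves "running" on disk
        result = self.runners[record["kind"]](record, prompt)
        record["status"] = "idle"
        record["log"][-1]["result"] = result
        self._save(record)
        return result


# Usage: a stand-in runner, then a simulated orchestrator restart.
state = Path(tempfile.mkdtemp())
runners = {"echo": lambda record, prompt: f"{record['name']} did: {prompt}"}

ada = Orchestrator(state, runners)
ada.create("sony", "echo")
ada.create("blink", "echo")
ada.command("sony", "build backend")

recovered = Orchestrator(state, runners)  # new process, same state directory
print(recovered.list_agents())
print(recovered.status("sony")["log"])
```

Swapping a model or adding a vendor only changes the `runners` mapping; the four-method surface and the state files stay the same, which is the property the thinness argument depends on.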

Mapping against Ray Data Co

Vault already runs a thin orchestrator pattern at the SOP layer (skills/SKILL.md spec + sub-agent fan-out in /process-newsletter and /process-youtube watch). Ada’s CRUD orchestrator over Sony and Blink is the same pattern at runtime; the gap RDCO has not closed is named persistent agents (Ada, Sony, Blink) versus the current ephemeral sub-agent pattern. Worth piloting: a /agents skill that creates/lists/commands persistent agents inside a single Claude Code session, exposing the same CRUD surface Dan demos. ~3 hour build, big leverage on multi-step backfill cycles like this one.
Multi-agent observability is the missing third leg of the autonomous loop. RDCO has the work loop (cron + check-board) and the queueing loop (Notion board), but no live-pulse view of what sub-agents are doing while they run. Dan’s UI is overkill for a single-operator agent like Ray, but a stripped-down ~/.claude/state/agent-pulse.jsonl append log + tail -f-able view of what each spawned sub-agent is currently doing would surface stalls and reduce founder-side anxiety about “is the agent working or hung?” Pairs with 2026-04-19-indydevdan-self-validating-hooks.
The “Sony touched Blink’s files” failure maps directly onto the cycle 9 incident from the 2026-04-19 backfill where the founder’s note flagged inconsistent assessment file structure across sub-agents. Same root cause: insufficient partitioning in the spec. The fix Dan demos — explicit “Sony only writes to backend/, Blink only writes to frontend/” — should be ported to the SKILL.md format: any skill that fans out to N sub-agents must explicitly state the disjoint output paths each sub-agent owns. Worth a one-line addition to /process-youtube and /process-newsletter SKILLs.
“Think in ands, not ors” reinforces the existing harness thesis but adds a vendor-composition angle. The harness-thesis cluster (2026-04-15-thariq-claude-code-session-management-1m-context, 2026-04-12-cobus-greyling-weights-context-harness) argues scaffold matters more than weights. Dan’s argument is the next layer up: vendor-mix matters more than scaffold. Even a great Anthropic-only harness is weaker than a multi-vendor harness that uses each lab where it’s currently best. This is a real Sanity Check angle: “Loyalty is a tax. The harness era is also the polyglot-vendor era.”
Recoverability via disk-resident state is a vault hygiene principle that should be made explicit in skills/. Dan’s crash-then-relist works because every agent logged to disk. RDCO’s check-board cycles already write state mid-flight (working-context.md), but no skill spec currently requires that sub-agents flush partial state before terminating. Worth adding to the global SKILL.md template.
The voice-input demo is not directly applicable — RDCO’s interface is text channels (iMessage/Discord) by design, and the founder has explicitly named voice as a non-priority for now. Skip the voice angle for any Sanity Check derived from this. Note for future: if a voice channel ever opens, Ada’s pattern is the model.
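
The output-path partitioning rule from the Sony/Blink item above can be checked mechanically before a fan-out starts. The sketch below is one possible shape, assuming each sub-agent declares the directories it owns; the function names `overlapping_owners` and `may_write` and the example paths are hypothetical. It does not normalize `..` segments or symlinks, so it is a spec lint, not a sandbox.

```python
from pathlib import PurePosixPath


def overlapping_owners(owned: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return pairs of agents whose owned directories are equal or nested."""
    flat = [(agent, PurePosixPath(p)) for agent, paths in owned.items() for p in paths]
    clashes = []
    for i, (a, pa) in enumerate(flat):
        for b, pb in flat[i + 1:]:
            if a != b and (pa == pb or pa in pb.parents or pb in pa.parents):
                clashes.append((a, b))
    return clashes


def may_write(agent: str, target: str, owned: dict[str, list[str]]) -> bool:
    """True only if target is one of the agent's owned paths or sits inside one."""
    t = PurePosixPath(target)
    return any(
        t == PurePosixPath(p) or PurePosixPath(p) in t.parents
        for p in owned.get(agent, [])
    )


# Usage: the partition Dan states explicitly after the collision.
owned = {"sony": ["backend"], "blink": ["frontend"]}
print(overlapping_owners(owned))
print(may_write("sony", "backend/api/routes.py", owned))
print(may_write("sony", "frontend/App.tsx", owned))
```

Running `overlapping_owners` when a skill is loaded would catch an ambiguous spec before any sub-agent writes a file, and `may_write` could back a pre-write hook if one is available.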

Open follow-ups

Pilot a /agents persistent-named-agent skill. CRUD surface: create, list, command, status, delete. Backed by tmux panes or background bash workers. Test against the next backfill cycle (specialize Sony=transcript-extractor, Blink=assessment-writer, run them in parallel against 5 videos). ~3 hour build.
Add agent-pulse.jsonl append log + tail -f viewer. Every sub-agent spawn writes start/end/current-step lines. Founder can run tail -n 20 ~/.claude/state/agent-pulse.jsonl for a snapshot, or tail -f on the same file to follow it live. ~30 min build.
Update /process-youtube and /process-newsletter SKILL specs with explicit output-path partitioning when fanning out. One-line additions: “Sub-agent N writes ONLY to [its own output path]; never touches [any other sub-agent’s path].” Prevents the Sony/Blink collision in our own pipelines.
Add “flush partial state before termination” requirement to global SKILL.md template. Recoverability principle. Every long-running sub-agent must checkpoint to a known location.
Sanity Check angle: “Vendor loyalty is a tax in the harness era.” Pair Dan’s compose-don’t-pick frame with 2026-04-19-indydevdan-top-2-percent-plan-2026 (custom-agents) and the harness thesis cluster. Could be a strong issue.
Investigate Gemini 2.5 Computer Use as a Playwright alternative for the /build-landing-page review loop. Currently uses Playwright MCP; Gemini’s Computer Use model might be cheaper and more agent-native. Comparison brief, ~1hr.
Tactical Agentic Coding course evaluation deferred — see Sponsorship section below for disclosure context, but not recommending the course purchase here. The vault gets the operational ideas free from his videos.
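
The agent-pulse.jsonl follow-up above is small enough to sketch directly. This is an illustration under assumptions: the record fields (`ts`, `agent`, `event`, `step`) and the helper names `pulse` and `latest` are invented for the example, and the demo writes to a temporary directory instead of ~/.claude/state/.

```python
import json
import tempfile
import time
from pathlib import Path


def pulse(log: Path, agent: str, event: str, step: str = "") -> None:
    """Append one JSON line describing what a sub-agent is doing right now."""
    log.parent.mkdir(parents=True, exist_ok=True)
    line = json.dumps({"ts": time.time(), "agent": agent, "event": event, "step": step})
    with log.open("a", encoding="utf-8") as f:
        f.write(line + "\n")


def latest(log: Path) -> dict[str, dict]:
    """Collapse the log to the most recent record per agent."""
    state: dict[str, dict] = {}
    for raw in log.read_text(encoding="utf-8").splitlines():
        if raw.strip():
            record = json.loads(raw)
            state[record["agent"]] = record
    return state


# Usage: two sub-agents start, one finishes.
log = Path(tempfile.mkdtemp()) / "agent-pulse.jsonl"
pulse(log, "sony", "start", "extract transcript")
pulse(log, "blink", "start", "write assessment")
pulse(log, "sony", "end")

for agent, record in latest(log).items():
    print(agent, record["event"], record["step"])
```

Because every write is an append of one complete line, the file stays readable with tail while sub-agents are still running, and a stalled agent shows up as a `start` record with an old timestamp and no matching `end`.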

Sponsorship

The back ~7 minutes of this video (roughly 24:00–31:00) is a sales pitch for Dan’s own course, Tactical Agentic Coding, with an Early Bird deadline of “Wednesday” relative to upload date (2025-10-13, so EB ended ~2025-10-15). He discloses pricing structure, refund policy (full refund within 30 days before lesson 4), and explicitly names the audience exclusion (“not for noobs / vibe coders”). It is a self-sponsored segment, not a third-party advertiser, but per RDCO’s bias-flagging discipline it functions identically: Dan has direct financial incentive in the framings used in the first 24 minutes (the “you, the engineer, are the bottleneck” framing maps cleanly to “buy my course to fix the bottleneck”). The technical content stands on its own (the demo works, the architecture is real, the failure modes are honestly shown), but the path from conclusion to purchase is short enough to flag. Treat the technical claims as testable and the framing claims (especially around “compute = impact” causality) as adjacent to a sales pitch.

Related

~/rdco-vault/06-reference/transcripts/2026-04-20-indydevdan-big-3-super-agent-transcript.md — raw transcript
~/rdco-vault/06-reference/2026-04-20-indydevdan-one-agent-to-rule-them-all.md — companion piece, same orchestrator pattern in a single-vendor (all-Claude) variant
~/rdco-vault/06-reference/2026-04-20-indydevdan-agent-threads-boris-cherny.md — threads framework that names what this demo is doing (B-thread + F-thread combination)
~/rdco-vault/06-reference/2026-04-19-indydevdan-top-2-percent-plan-2026.md — earlier filing of the custom-agents/private-evals frame this demo extends
~/rdco-vault/06-reference/2026-04-19-indydevdan-self-validating-hooks.md — hook-based validation that powers Blink’s closed-loop validation step
~/rdco-vault/06-reference/2026-04-15-thariq-claude-code-session-management-1m-context.md — context-rot principle that justifies the thin-orchestrator design (Ada must NOT observe agent logs continuously)
