“Background agents are here. Your orchestration isn’t ready.” — Dan Farrelly
Why this is in the vault
Founder shared 2026-05-08 ~16:58 ET. The thesis is the most explicit articulation of the harness-not-framework argument we’ve seen — directly validates RDCO’s existing architecture (Claude Code + skill-as-harness + autonomous loop). Dan’s 3-layer model (orchestration stable / agent fluid / model volatile) maps cleanly onto Ray’s architecture. The piece earns vault placement on thesis quality alone, even with the seller-bias caveat below.
⚠️ Sponsorship
Author is Dan Farrelly, CTO + co-founder of Inngest — a durable-orchestration platform for agents and workflows. The essay’s conclusion (“you need durable orchestration as a layer”) is structurally the pitch for his product. The thesis is independently strong, but the read should treat Inngest-the-buy-decision separately from harness-the-architecture-principle. RDCO already has the orchestration layer covered locally (Claude Code + cron + /loop); we are not Inngest customers.
The core argument
Every 6 months the “right” way to build an AI agent changes (RAG → vector DBs → ReAct → virtual memory → bigger context → prompt chaining → routing → orchestrator-workers → context engineering → browser → MCP → specialized sub-agents → generic agents → CLIs → sandboxes → software factories → context syncing). If you coupled infrastructure to any one of these patterns, you’ve already rebuilt at least twice.
The layer that doesn’t change: durable orchestration. Steps, events, state, retries, observability. Every pattern listed above runs on the same primitives. Get this layer right and changing agent patterns is easier. Get it wrong and every pattern shift is a rewrite.
Key frameworks
The framework trap
Agent frameworks aren’t libraries — they’re bets on which agent pattern wins. When the pattern shifts, you don’t refactor; you rewrite. LangGraph encodes graph-based control flow. CrewAI encodes role-based agents. AutoGen encodes conversational multi-agent. Each is optimized for one view of how agents should work, and each becomes a liability when that view changes. (LangChain has already moved on to “deep agents,” and AutoGen is in maintenance mode after Microsoft shifted to the Microsoft Agent Framework. Case in point.)
Dan invokes Anthropic’s agent-patterns guide — “incorrect assumptions about what’s under the hood are a common source of customer error” — to argue against framework adoption. The remedy: abstract the primitives (steps, retries, state) but NOT the topology.
The 5 stable primitives
- Durable steps — work checkpoints so an error mid-loop doesn’t lose 40 minutes of progress
- Persistent external state — survives process crashes and deployments
- Parallel work coordination — fan-out/fan-in, parallel tool calls, sub-agent delegation
- Event-driven control flow — pause and wait for a signal (HITL, cancellation, webhook) without holding a connection open
- Structured execution observability — every step and decision inspectable, structured not just logs
Compose these into whatever pattern is current. Recompose when the pattern changes. ReAct loops, planning agents, multi-agent delegation all reduce to the same step.run() and step.invoke() calls underneath.
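A toy illustration of what durable steps buy you (hypothetical code, not Inngest’s SDK — the essay’s `step.run()` is the real-world analogue): each step’s result is checkpointed, so a crashed run resumes by replaying the log instead of re-executing completed work.

```python
import json
import os
import tempfile

class DurableRun:
    """Toy durable-step runner: checkpoints every step result to disk so
    a crashed run resumes without re-executing completed work."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        if os.path.exists(path):
            with open(path) as f:
                self.state = json.load(f)  # replay prior checkpoints

    def run(self, step_id, fn):
        if step_id in self.state:
            return self.state[step_id]  # completed in a previous attempt
        result = fn()
        self.state[step_id] = result
        with open(self.path, "w") as f:
            json.dump(self.state, f)  # checkpoint after every step
        return result

path = os.path.join(tempfile.gettempdir(), "demo-agent-run.json")
if os.path.exists(path):
    os.remove(path)  # fresh demo run
run = DurableRun(path)
plan = run.run("plan", lambda: ["fetch", "summarize"])
summary = run.run("summarize", lambda: f"completed {len(plan)} planned steps")
```

A second `DurableRun(path)` after a crash returns the cached `plan` without re-running the lambda; that replay-from-log behavior is the whole point of the layer.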
The 3-layer architecture (load-bearing for RDCO mapping)
- Orchestration (stable, multi-year decision): durable execution, step primitives, event system, state management, observability, scheduling. Doesn’t change when agent patterns change.
- Agent (fluid, 3-6 month rewrite cadence): how you structure LLM calls, tool use, reasoning, delegation. Changes every 3-6 months. Should be easy to change because it’s just application code running on durable primitives.
- Model (volatile, monthly): which LLM you call, which API, which provider. Should be a single-line change, not an architecture change.
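The “single-line change” claim at the model layer is easy to make concrete. A sketch (model and skill names are taken from this note; the mapping itself is illustrative, not RDCO’s actual config):

```python
# Model choice as data: swapping providers or versions is a config edit,
# not an architecture change.
SKILL_MODELS = {
    "process-newsletter": "claude-haiku-4-5",
    "build-landing-page": "claude-sonnet-4-6",
    "design-critic": "claude-opus-4-7",
}

DEFAULT_MODEL = "claude-sonnet-4-6"

def model_for(skill: str) -> str:
    """Resolve which model a skill should call, falling back to a default."""
    return SKILL_MODELS.get(skill, DEFAULT_MODEL)
```

If this lookup is the only place a model name appears, the monthly churn Dan describes stays a one-field edit.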
Sandbox vs orchestration are different layers
Sandboxes (Daytona, e2b, etc.) operate at the compute layer — “where does the agent run?” Some pause and resume full VM state, but that’s a runtime snapshot, not a workflow snapshot. They can’t tell you which steps completed, what they returned, or where to resume without re-executing successful work.
When Claude Code or OpenCode runs the harness inside the sandbox itself, the harness state lives in the sandbox filesystem and the sandbox’s VM snapshots become the durability layer — “it turns the sandbox provider into the ‘orchestration’ provider by accident. Actual agent orchestration is now split across multiple layers with mixed levels of observability and durability.”
The two layers are complementary; conflating them is the mistake. Orchestration should sit ABOVE sandboxes, managing their lifecycle and retaining state.
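The orchestration-above-sandbox shape can be sketched in a few lines (the sandbox client here is invented for illustration; Daytona and e2b have their own SDKs): each lifecycle action is a recorded orchestration step, so the workflow log — not a VM snapshot — knows which steps completed and what they returned.

```python
class FakeSandbox:
    """Stand-in for a sandbox provider SDK -- invented, not a real API."""
    def create(self):
        return "sb-001"
    def exec(self, sandbox_id, cmd):
        return f"{sandbox_id} ran: {cmd}"
    def destroy(self, sandbox_id):
        return "destroyed"

def step(log, key, fn):
    # Record each step's result; on retry, completed steps are skipped.
    if key not in log:
        log[key] = fn()
    return log[key]

def orchestrate(sandbox, log):
    # The orchestrator owns the sandbox lifecycle and retains the state;
    # the workflow snapshot lives in `log`, not in the sandbox's VM.
    sid = step(log, "create", sandbox.create)
    out = step(log, "exec", lambda: sandbox.exec(sid, "claude -p 'build it'"))
    step(log, "destroy", lambda: sandbox.destroy(sid))
    return out
```

Re-running `orchestrate` with the same `log` is a no-op replay — exactly what a runtime-level VM snapshot cannot tell you.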
The composability argument
Today’s patterns aren’t the final patterns. New model capabilities will create new architectures we can’t predict. With composable primitives (step.run(), step.invoke(), step.waitForEvent(), step.sleep()), new patterns are new compositions, not new infrastructure. Frameworks struggle here because they encode fixed topology.
Dan also argues — and this is the underrated point — that teams with strong orchestration + observability iterate faster: “The composability gap is really an observability problem. You can’t recompose what you can’t see.”
Background agent gap
The next pattern shift: synchronous chat agents → asynchronous background agents. Background agents need:
- Long-running execution with crash recovery — 45-minute agent runs can’t live in 5-min-timeout Lambdas or in memory on a single process
- Multi-step observability — when a 30-minute background agent produces a bad result, you need every step trace
- Event-driven control flow — pause and wait for external input without blocking a thread
- Lifecycle controls — status, cancellation, scheduling, inspection. Either adopt a layer that gives you this OR build a fragile version that needs maintenance.
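The “pause without blocking a thread” requirement reduces to a register-and-return shape. A toy in-memory sketch (invented names; a real system would persist the waiting set so it survives restarts):

```python
class EventWaiter:
    """Toy pause/resume: a run registers what it is waiting for and
    releases its process; delivery of the matching event resumes it."""

    def __init__(self):
        self.waiting = {}   # event_name -> resume callback
        self.results = []

    def wait_for_event(self, event_name, on_event):
        # Register interest and return immediately -- no open connection,
        # no blocked thread while a human or webhook takes hours to respond.
        self.waiting[event_name] = on_event

    def deliver(self, event_name, payload):
        resume = self.waiting.pop(event_name, None)
        if resume:
            self.results.append(resume(payload))

router = EventWaiter()
router.wait_for_event("approval", lambda p: f"approved by {p['user']}")
# ...the process could exit and restart here; only `waiting` must persist...
router.deliver("approval", {"user": "ray"})
```

The same shape covers HITL approval, cancellation signals, and inbound webhooks — the three cases Dan names.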
Mapping against Ray Data Co (load-bearing)
Strong validation for the existing architecture. The 3-layer model maps directly onto Ray:
| Dan’s layer | RDCO instantiation | Stability |
|---|---|---|
| Orchestration (stable) | Claude Code + /loop + cron-create + skill ecosystem + autonomous loop + /check-board + /process-* watch loops | Multi-year |
| Agent (fluid) | Individual skills (/process-newsletter, /design-critic, /build-landing-page, etc.) + their internal logic | 3-6 months |
| Model (volatile) | Anthropic Opus 4.7 / Sonnet 4.6 / Haiku 4.5 — swappable per-skill | Monthly |
The /loop dynamic-mode pattern + ScheduleWakeup + CronCreate are RDCO’s durable-step primitives. Each cron-fired skill is a checkpoint. The fact that the autonomous loop survives session compaction (per CLAUDE.md hard rule #4 + working-context.md) IS the persistent external state. Sub-agent fan-out via the Agent tool IS the parallel work coordination. The Monitor tool with <task-notification> events IS event-driven control flow.
Ray Data Co’s architecture has been on this thesis for months without naming it.
The /improve task validates this principle in real time
Today’s session produced an /improve task: extract /build-website-discovery from the SC v3 fresh-build interview as a composable upstream skill that feeds /build-landing-page. That’s exactly Dan’s principle — abstract the primitives (discovery, build, critic), don’t abstract the topology. The cc-wrapped reference architecture the founder cited (utility skill → workflow command → dispatch) is Dan’s “compose primitives” thesis applied at the skill layer.
Where the article complicates RDCO’s stack
- Sandbox-vs-orchestration conflation warning applies to us. Ray’s “harness state lives in Claude Code” model means the Claude Code session IS the orchestration layer. That works while sessions are stable; it’s fragile if Anthropic changes Claude Code’s session-management semantics. Worth holding as a known fragility.
- “Background agents need crash recovery” is currently weak for RDCO. /loop + cron survive between fires but a single multi-hour agent run doesn’t have crash recovery; if the parent context dies mid-build, we lose state. Mitigation: subagents return summary lines + write artifacts to vault before terminating, so state persists. Full crash recovery would need explicit checkpointing.
- Lifecycle controls (status/cancellation/inspection) are partial — TaskList + TaskStop work for shells/agents in-session but not across sessions. Acceptable for the autonomous-COO use case; would not scale to a multi-tenant production system.
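The “write artifacts to vault before terminating” mitigation in the crash-recovery bullet above can be hardened with atomic writes. A sketch (paths and function names are illustrative, not RDCO’s actual tooling):

```python
import os
import tempfile

def write_artifact(vault_path, content):
    """Persist a subagent's output before it terminates. Write to a temp
    file in the same directory, then rename -- os.replace is atomic on
    POSIX, so a crash mid-write never leaves a half-written artifact."""
    directory = os.path.dirname(vault_path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(content)
        os.replace(tmp, vault_path)  # atomic swap into the vault
    finally:
        if os.path.exists(tmp):
            os.remove(tmp)  # clean up only if the swap never happened
```

This doesn’t give full per-step crash recovery, but it guarantees the vault only ever contains complete artifacts, which is what the summary-line-plus-artifact mitigation relies on.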
What this validates retroactively
The Apr 2026 cross-check on harness-thesis (9 sources converged) gets a 10th data point. The Cobus Greyling weights→context→harness piece, Harrison Chase memory-as-moat, Jonathan Natkins data-layer-does-the-work, the arxiv 2604.08224 externalization paper — Dan’s piece is the most explicit synthesis of why frameworks die and primitives compose.
Notable quotes (≤15 words each, in quotation marks)
- “Agent frameworks aren’t libraries. They’re bets on which agent pattern wins.”
- “Abstract the primitives. Don’t abstract the topology.”
- “The composability gap is really an observability problem.”
- “Background agents aren’t coming. They’re here.”
- “Today’s patterns are not the final patterns.”
Open follow-ups
- Audit RDCO’s autonomous-loop crash-recovery posture. Currently relies on subagent return + vault writes; would benefit from explicit per-skill checkpoint primitives if the loop ever needs to scale to multi-hour single-task runs.
- Consider Dan’s step.run() / step.invoke() / step.waitForEvent() / step.sleep() primitive set as a vocabulary for future skill design — even if we don’t adopt Inngest, the primitive vocabulary is sharper than ad-hoc skill-by-skill design.
- Watch Dan / Inngest output for further thesis development — he’s a sharp voice on this specific layer.
Related
- 06-reference/2026-04-15-thariq-claude-code-session-management-1m-context — Anthropic’s context-rot guidance, complementary
- 06-reference/2026-05-08-thariq-unreasonable-effectiveness-html — Thariq’s HTML-as-output piece from the same week, different layer
- 06-reference/2026-04-12-cobus-greyling-weights-context-harness — Cobus Greyling’s weights→context→harness language shift
- 06-reference/2026-04-12-harrison-chase-your-harness-your-memory — Harrison Chase memory-as-moat
- 06-reference/2026-04-12-arxiv-2604-08224-externalization-llm-agents — academic validation of harness thesis
- ~/.claude/skills/loop/SKILL.md — RDCO’s /loop dynamic-mode + cron-fixed-interval implementation
- 01-projects/skill-improvements/ — /improve queue including /build-website-discovery (today’s task)
Source caveat
Article body retrieved via xmcp getPostsById with tweet.fields: ["article", ...] + expansions: ["article.cover_media", "article.media_entities"] — same fetch path that worked for Thariq’s piece earlier today. The X article URL itself was auth-gated.