06-reference

dan farrelly background agents orchestration

Thu May 07 2026 20:00:00 GMT-0400 (Eastern Daylight Time) · reference · source: X (Dan Farrelly @djfarrelly long-form article) · by Dan Farrelly
agent-architecture · durable-orchestration · harness-thesis · frameworks-vs-primitives · background-agents · inngest · framework-trap · sandbox-vs-orchestration

“Background agents are here. Your orchestration isn’t ready.” — Dan Farrelly

Why this is in the vault

Founder shared 2026-05-08 ~16:58 ET. The thesis is the most explicit articulation of the harness-not-framework argument we’ve seen — directly validates RDCO’s existing architecture (Claude Code + skill-as-harness + autonomous loop). Dan’s 3-layer model (orchestration stable / agent fluid / model volatile) maps cleanly onto Ray’s architecture. The piece earns vault placement on thesis quality alone, even with the seller-bias caveat below.

⚠️ Sponsorship

Author is Dan Farrelly, CTO + co-founder of Inngest — a durable-orchestration platform for agents and workflows. The essay’s conclusion (“you need durable orchestration as a layer”) is structurally the pitch for his product. The thesis is independently strong, but the read should treat Inngest-the-buy-decision separately from harness-the-architecture-principle. RDCO already has the orchestration layer covered locally (Claude Code + cron + /loop); we are not Inngest customers.

The core argument

Every 6 months the “right” way to build an AI agent changes (RAG → vector DBs → ReAct → virtual memory → bigger context → prompt chaining → routing → orchestrator-workers → context engineering → browser → MCP → specialized sub-agents → generic agents → CLIs → sandboxes → software factories → context syncing). If you coupled infrastructure to any one of these patterns, you’ve already rebuilt at least twice.

The layer that doesn’t change: durable orchestration. Steps, events, state, retries, observability. Every pattern listed above runs on the same primitives. Get this layer right and changing agent patterns is easier. Get it wrong and every pattern shift is a rewrite.
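
The "steps, events, state, retries" bundle reduces to a small amount of code. A minimal Python sketch of a checkpointed step runner (all names invented for illustration; Inngest's actual API is TypeScript's `step.run()`):

```python
import json
import pathlib


class StepRunner:
    """Checkpoint each step's result so a crash mid-run resumes
    without redoing work that already succeeded."""

    def __init__(self, state_file: str):
        self.path = pathlib.Path(state_file)
        # Persistent external state: reload checkpoints if they exist.
        self.state = json.loads(self.path.read_text()) if self.path.exists() else {}

    def run(self, step_id: str, fn, retries: int = 3):
        if step_id in self.state:
            # Step already completed in a prior run: replay from checkpoint.
            return self.state[step_id]
        last_err = None
        for _ in range(retries):
            try:
                result = fn()
                self.state[step_id] = result
                # Persist before returning so the checkpoint survives a crash.
                self.path.write_text(json.dumps(self.state))
                return result
            except Exception as err:
                last_err = err
        raise last_err
```

A run that dies after step 3 of 10 restarts, replays steps 1-3 from the checkpoint file in milliseconds, and resumes real work at step 4. That replay-from-state behavior is what "get this layer right" buys.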

Key frameworks

The framework trap

Agent frameworks aren’t libraries — they’re bets on which agent pattern wins. When the pattern shifts, you don’t refactor; you rewrite. LangGraph encodes graph-based control flow. CrewAI encodes role-based agents. AutoGen encodes conversational multi-agent systems. Each is optimized for one view of how agents should work, and each becomes a liability when that view changes. (LangChain has already moved on to “deep agents,” and AutoGen is in maintenance mode after Microsoft shifted to the Microsoft Agent Framework. Case in point.)

Dan invokes Anthropic’s agent-patterns guide — “incorrect assumptions about what’s under the hood are a common source of customer error” — to argue against framework adoption. The remedy: abstract the primitives (steps, retries, state) but NOT the topology.

The 5 stable primitives

  1. Durable steps — work checkpoints so an error mid-loop doesn’t lose 40 minutes of progress
  2. Persistent external state — survives process crashes and deployments
  3. Parallel work coordination — fan-out/fan-in, parallel tool calls, sub-agent delegation
  4. Event-driven control flow — pause and wait for a signal (HITL, cancellation, webhook) without holding a connection open
  5. Structured execution observability — every step and decision inspectable, structured not just logs

Compose these into whatever pattern is current. Recompose when the pattern changes. ReAct loops, planning agents, multi-agent delegation all reduce to the same step.run() and step.invoke() calls underneath.
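
To make that concrete, here is a toy sketch of two different agent patterns built from the same two primitives. `step_run`/`step_invoke` are stand-ins for Inngest-style calls (a real implementation would checkpoint, as above); the agents themselves are stubs:

```python
def step_run(step_id: str, fn):
    # Stand-in for a durable step; a real version checkpoints by step_id.
    return fn()


def step_invoke(agent, payload):
    # Delegation is just a named step that calls another agent.
    return step_run(f"invoke:{agent.__name__}", lambda: agent(payload))


# --- Pattern A: planner-executor ---
def planner(task):
    return [f"{task}:part{i}" for i in range(2)]


def executor(subtask):
    return subtask.upper()


def planning_agent(task):
    plan = step_run("plan", lambda: planner(task))
    return [step_invoke(executor, s) for s in plan]


# --- Pattern B: ReAct-style loop on the SAME primitives ---
def react_agent(task, max_turns=3):
    scratch = []
    for turn in range(max_turns):
        thought = step_run(f"think:{turn}", lambda: f"thought about {task}")
        scratch.append(thought)
        if turn == 1:  # toy stopping condition
            break
    return scratch
```

Swapping pattern A for pattern B changes the composition, not the infrastructure — which is the whole claim.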

The 3-layer architecture (load-bearing for RDCO mapping)

Sandboxes and orchestration are different layers

Sandboxes (Daytona, e2b, etc.) operate at the compute layer — “where does the agent run?” Some pause and resume full VM state, but that’s a runtime snapshot, not a workflow snapshot. They can’t tell you which steps completed, what they returned, or where to resume without re-executing successful work.

When Claude Code or OpenCode runs the harness inside the sandbox itself, the harness state lives in the sandbox filesystem and the sandbox’s VM snapshots become the durability layer — “it turns the sandbox provider into the ‘orchestration’ provider by accident. Actual agent orchestration is now split across multiple layers with mixed levels of observability and durability.”

The two layers are complementary; conflating them is the mistake. Orchestration should sit ABOVE sandboxes, managing their lifecycle and retaining state.
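
A sketch of "orchestration above the sandbox": workflow state lives in the orchestrator, and the sandbox is a disposable compute resource. The `Sandbox` class and its methods are invented for illustration (real providers like Daytona or e2b have their own APIs):

```python
class Sandbox:
    """Toy stand-in for a sandbox provider's VM/container handle."""

    def __init__(self):
        self.alive = True

    def exec(self, cmd: str) -> str:
        return f"ran:{cmd}"

    def destroy(self):
        self.alive = False


def orchestrated_run(steps, state: dict) -> dict:
    sb = Sandbox()  # lifecycle managed from above, per run
    try:
        for step in steps:
            if step in state:
                continue  # workflow state lives OUTSIDE the sandbox
            state[step] = sb.exec(step)
    finally:
        sb.destroy()  # sandbox is disposable; the step record survives it
    return state
```

If the sandbox dies mid-run, the orchestrator knows exactly which steps completed and resumes in a fresh sandbox — the inverse of letting VM snapshots become the durability layer by accident.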

The composability argument

Today’s patterns aren’t the final patterns. New model capabilities will create new architectures we can’t predict. With composable primitives (step.run(), step.invoke(), step.waitForEvent(), step.sleep()), new patterns are new compositions, not new infrastructure. Frameworks struggle here because they encode fixed topology.

Dan also argues — and this is the underrated point — that teams with strong orchestration + observability iterate faster: “The composability gap is really an observability problem. You can’t recompose what you can’t see.”

Background agent gap

The next pattern shift: synchronous chat agents → asynchronous background agents. Background agents need:

  1. Long-running execution with crash recovery — a 45-minute agent run can’t live in a 5-min-timeout Lambda or in memory on a single process
  2. Multi-step observability — when a 30-minute background agent produces a bad result, you need every step trace
  3. Event-driven control flow — pause and wait for external input without blocking a thread
  4. Lifecycle controls — status, cancellation, scheduling, inspection. Either adopt a layer that gives you this OR build a fragile version that needs maintenance.
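
Need 3 (and primitive 4 above) is the subtle one: the run must park itself durably and release its resources, then resume when the event arrives. A minimal sketch with an in-memory store standing in for persistent state (all names illustrative; Inngest's real equivalent is `step.waitForEvent()`):

```python
def suspend_for_event(run_id: str, event_name: str, store: dict):
    """Park the run: record what it's waiting for, then return control
    to the caller. No thread or connection is held open."""
    store[run_id] = {"status": "waiting", "event": event_name}


def deliver_event(event_name: str, payload, store: dict, resume):
    """Event router: find parked runs waiting on this event and resume
    each one via the supplied resume callback."""
    for run_id, record in list(store.items()):
        if record.get("status") == "waiting" and record["event"] == event_name:
            store[run_id] = {"status": "running"}
            resume(run_id, payload)
```

An HITL approval, a cancellation, or an inbound webhook are all just `deliver_event` calls against the same parked-run table — which is why one primitive covers all three.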

Mapping against Ray Data Co (load-bearing)

Strong validation for the existing architecture. The 3-layer model maps directly onto Ray:

| Dan’s layer | RDCO instantiation | Stability |
| --- | --- | --- |
| Orchestration (stable) | Claude Code + /loop + cron-create + skill ecosystem + autonomous loop + /check-board + /process-* watch loops | Multi-year |
| Agent (fluid) | Individual skills (/process-newsletter, /design-critic, /build-landing-page, etc.) + their internal logic | 3-6 months |
| Model (volatile) | Anthropic Opus 4.7 / Sonnet 4.6 / Haiku 4.5 — swappable per-skill | Monthly |

The /loop dynamic-mode pattern + ScheduleWakeup + CronCreate are RDCO’s durable-step primitives. Each cron-fired skill is a checkpoint. The fact that the autonomous loop survives session compaction (per CLAUDE.md hard rule #4 + working-context.md) IS the persistent external state. Sub-agent fan-out via the Agent tool IS the parallel work coordination. The Monitor tool with <task-notification> events IS event-driven control flow.

Ray Data Co’s architecture has been on this thesis for months without naming it.

The /improve task validates this principle in real time

Today’s session produced an /improve task: extract /build-website-discovery from the SC v3 fresh-build interview as a composable upstream skill that feeds /build-landing-page. That’s exactly Dan’s principle — abstract the primitives (discovery, build, critic), don’t abstract the topology. The cc-wrapped reference architecture the founder cited (utility skill → workflow command → dispatch) is Dan’s “compose primitives” thesis applied at the skill layer.

Where the article complicates RDCO’s stack

What this validates retroactively

The Apr 2026 cross-check on harness-thesis (9 sources converged) gets a 10th data point. The Cobus Greyling weights→context→harness piece, Harrison Chase memory-as-moat, Jonathan Natkins data-layer-does-the-work, the arXiv 2604.08224 externalization paper — Dan’s piece is the most explicit synthesis of why frameworks die and primitives compose.

Notable quotes (≤15 words each, in quotation marks)

Open follow-ups

Source caveat

Article body retrieved via xmcp getPostsById with tweet.fields: ["article", ...] + expansions: ["article.cover_media", "article.media_entities"] — same fetch path that worked for Thariq’s piece earlier today. The X article URL itself was auth-gated.