
fat cat council architecture proposal

Thu Apr 30 2026 20:00:00 GMT-0400 (Eastern Daylight Time) · tooling · status: proposal
agentic-architecture · fat-cat · felix · cloudflare-agents · multi-provider · l5-trajectory

Fat Cat Council Architecture — Proposal

Founder’s vision

A council of board-member-type agents (“Fat Cats”) that reviews, approves, and pushes back on operating decisions made by Ray (the COO agent). Five Fat Cats vote independently, output is consolidated, and founder + Ray retain ultimate operating say. The council is on-demand — Ray (or founder) calls a “weekly board meeting” rather than running it always-on. Diversity is a feature: each Fat Cat is powered by a different model provider so the council surfaces genuinely different reasoning, not five Claudes in trench coats.

Felix is a separate persona — IT support specialist, originally pitched as an on-device Mac mini agent in 2026-04-19-agentic-team-architecture. Founder is now reconsidering whether Felix should also live remote (less complicated than another tmux + watchdog + plist on the mini). This proposal treats Felix and the Fat Cat council as the same architectural problem (auxiliary persona spin-up) and proposes one substrate that handles both.

The founder’s hardware musings (Cloudflare Workers + R2, which he called “R1”, + worker images with Claude Code pre-flashed) point at the right substrate. This proposal sharpens those into one specific recommendation.

One Cloudflare Worker per persona, built on the Cloudflare Agents SDK (Durable Object substrate), routed through the Vercel AI Gateway for multi-provider model access. R2 is the shared corpus (vault snapshot + decision specs); Durable Object SQLite is per-Fat-Cat working memory. Ray on the Mac mini invokes the council via authenticated HTTP fetch through Cloudflare Access (service token).

In one sentence: Fat Cats = Agent subclasses behind Cloudflare Access, using Vercel AI Gateway for provider diversity, with R2 for shared context and DO SQLite for per-cat memory.
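A minimal sketch of what Ray's side of the invocation could look like, assuming the endpoint named in the meeting steps and the standard Cloudflare Access service-token headers (`CF-Access-Client-Id` / `CF-Access-Client-Secret`). The payload field names (`specKey`, `seats`, `meetingType`) and the `buildMeetRequest` helper are hypothetical; the proposal only says the POST carries the spec key, the seated cats, and the meeting type.

```typescript
// Hypothetical payload shape: field names are assumptions, not part of the proposal.
type MeetingType = "review" | "approval" | "brainstorm";

interface MeetPayload {
  specKey: string; // R2 key of the decision spec
  seats: string[]; // which cats to seat
  meetingType: MeetingType;
}

interface FetchArgs {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the arguments for fetch(url, init). A Cloudflare Access service
// token is presented as a client id / client secret header pair.
function buildMeetRequest(
  payload: MeetPayload,
  token: { clientId: string; clientSecret: string },
  url: string = "https://council.raydata.co/meet",
): FetchArgs {
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "CF-Access-Client-Id": token.clientId,
        "CF-Access-Client-Secret": token.clientSecret,
      },
      body: JSON.stringify(payload),
    },
  };
}

// Usage (inside an async function on the Mac mini):
//   const { url, init } = buildMeetRequest(payload, token);
//   const res = await fetch(url, init);
```

Keeping request construction separate from the network call means the auth headers can be checked without touching the network, and the secret pair can be read from wherever Ray stores it.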

Architecture details

flowchart TB
    subgraph MacMini["Mac mini (always-on)"]
        Ray["Ray COO agent (Claude Code in tmux)"]
    end

    subgraph CFEdge["Cloudflare Edge"]
        Access["Cloudflare Access<br/>(service token JWT)"]
        Router["board-router Worker<br/>fan-out + consolidation"]
        subgraph Council["Fat Cat Council (5x Agent DOs)"]
            FC1["FatCatAnthropic DO<br/>state: SQLite"]
            FC2["FatCatOpenAI DO<br/>state: SQLite"]
            FC3["FatCatGemini DO<br/>state: SQLite"]
            FC4["FatCatGrok DO<br/>state: SQLite"]
            FC5["FatCatOpen DO<br/>state: SQLite"]
        end
        Felix["Felix DO<br/>(same substrate, different posture)"]
        R2["R2: shared corpus<br/>vault snapshot, decision spec, RDCO principles"]
    end

    subgraph Providers["Model providers"]
        Gateway["Vercel AI Gateway<br/>(unified provider/model routing)"]
        Anthropic["Anthropic (claude-opus-4.7)"]
        OpenAI["OpenAI (gpt-5.4)"]
        Google["Google (gemini-3.x)"]
        XAI["xAI (grok-4)"]
        Open["Workers AI<br/>(llama-4 / qwen-3)"]
    end

    Ray -->|HTTPS POST<br/>+ CF-Access-Client-Id/Secret| Access
    Access --> Router
    Router -->|fan-out: getAgentByName| FC1 & FC2 & FC3 & FC4 & FC5
    Router -.callable.-> Felix
    FC1 & FC2 & FC3 & FC4 & FC5 -->|read context pack| R2
    FC1 & FC2 & FC3 & FC4 & FC5 -->|generateText| Gateway
    Gateway --> Anthropic & OpenAI & Google & XAI & Open
    FC1 & FC2 & FC3 & FC4 & FC5 -->|return vote + reasoning| Router
    Router -->|consolidated brief| Ray

Component responsibilities

How a meeting actually runs

  1. Ray (Mac mini) prepares a decision spec (markdown: question, options, founder’s stated lean, vault refs). Uploads to R2 as meetings/<date>-<topic>/spec.md.
  2. Ray POSTs to https://council.raydata.co/meet with the spec key + which cats to seat (default: all 5) + meeting type (review / approval / brainstorm).
  3. board-router validates the Access JWT, loads the spec + the current corpus snapshot key, fans out to the 5 DOs in parallel. Each fan-out is a single getAgentByName(...).review(spec, corpusKey) call.
  4. Each FatCat DO loads its persona, reads the context pack from R2, and calls its assigned provider via the gateway. It returns within ~30-90 s.
  5. board-router consolidates: tallies votes, extracts dissent quotes, surfaces redlines. Writes meetings/<date>-<topic>/transcript.md to R2 and returns the consolidated brief to Ray.
  6. Ray relays the brief to founder (or files it to the vault, depending on meeting type). Founder makes the call.
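The consolidation in step 5 could be sketched roughly as below. This is an illustration under assumptions, not the router's actual design: the `approve` / `reject` vote values, the `SeatResult` and `Brief` shapes, and the rule that a tie counts as "split" are all hypothetical. Only `defer`, the vote tally, dissent quotes, redlines, and the `meetings/<date>-<topic>/...` key layout come from the text above.

```typescript
// Assumed vote vocabulary: only "defer" is named in the proposal.
type Vote = "approve" | "reject" | "defer";

interface SeatResult {
  seat: string;
  vote: Vote;
  reasoning: string;
  redlines: string[];
}

interface Brief {
  tally: Record<Vote, number>;
  majority: Vote | "split";
  dissent: { seat: string; quote: string }[];
  redlines: string[];
}

// R2 key layout from steps 1 and 5: meetings/<date>-<topic>/<file>
function meetingKey(date: string, topic: string, file: "spec.md" | "transcript.md"): string {
  return `meetings/${date}-${topic}/${file}`;
}

function consolidate(results: SeatResult[]): Brief {
  const tally: Record<Vote, number> = { approve: 0, reject: 0, defer: 0 };
  for (const r of results) tally[r.vote] += 1;

  // Majority is decided among non-deferring seats; a tie is reported as "split".
  let majority: Vote | "split" = "split";
  if (tally.approve > tally.reject) majority = "approve";
  else if (tally.reject > tally.approve) majority = "reject";

  // Dissent: seats that voted (did not defer) against the majority.
  const dissent: { seat: string; quote: string }[] = [];
  for (const r of results) {
    if (majority !== "split" && r.vote !== "defer" && r.vote !== majority) {
      dissent.push({ seat: r.seat, quote: r.reasoning });
    }
  }

  // Redlines: union across seats, first-seen order, de-duplicated.
  const redlines: string[] = [];
  for (const r of results) {
    for (const line of r.redlines) {
      if (redlines.indexOf(line) === -1) redlines.push(line);
    }
  }
  return { tally, majority, dissent, redlines };
}
```

The brief deliberately carries the raw tally alongside the majority so the founder sees how close the vote was, not just the outcome.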

Multi-provider strategy

Five seats, one model each. Why these:

| Seat | Model (mid-2026) | Why |
| --- | --- | --- |
| Fat Cat Anthropic | anthropic/claude-opus-4.7 | Strongest long-context reasoning. Anchor seat: closest to Ray’s own model, used as sanity floor. |
| Fat Cat OpenAI | openai/gpt-5.4 | Different RLHF lineage, different failure modes. Often catches things Claude misses on operational/legal nuance. |
| Fat Cat Gemini | google/gemini-3.x (whatever’s current via gateway) | Massive context window historically; tends to be more literal, useful as a “did we read the spec” check. |
| Fat Cat Grok | xai/grok-4 (or current) | Outsider voice, different training corpus posture. Good for steelmanning unconventional positions; bad-take risk priced in by the council vote. |
| Fat Cat Open | workers-ai/llama-4 or qwen-3 via Cloudflare AI Gateway | Cost floor + bias check. If 4 frontier models agree but the open model disagrees, that’s a signal worth examining (alignment monoculture detection). |
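One way the seat table could be expressed as config, as a sketch only. The model strings are copied from the table, including its own placeholders (`gemini-3.x`, and `llama-4` standing in for "llama-4 or qwen-3"), so they are not guaranteed to be valid gateway ids. The seat ids and the `seatsFor` helper are hypothetical.

```typescript
// Model strings as written in the seat table. "3.x" is the proposal's own
// placeholder, and the open seat could equally be qwen-3.
const SEAT_MODELS = {
  anthropic: "anthropic/claude-opus-4.7",
  openai: "openai/gpt-5.4",
  gemini: "google/gemini-3.x",
  grok: "xai/grok-4",
  open: "workers-ai/llama-4",
} as const;

type SeatId = keyof typeof SEAT_MODELS;

const ALL_SEATS = Object.keys(SEAT_MODELS) as SeatId[];

// Resolve which cats to seat: default is all five, unknown names are rejected.
function seatsFor(requested?: string[]): SeatId[] {
  if (!requested || requested.length === 0) return ALL_SEATS;
  const unknown = requested.filter((s) => ALL_SEATS.indexOf(s as SeatId) === -1);
  if (unknown.length > 0) {
    throw new Error(`unknown seat(s): ${unknown.join(", ")}`);
  }
  return requested as SeatId[];
}
```

Swapping a seat (say Grok for DeepSeek or Mistral) would then be a one-line change to the map rather than a new code path.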

Routing through Vercel AI Gateway. Single API key (OIDC-managed, no rotation), unified provider/model strings, automatic failover if a provider is down. Same gateway pattern RDCO already uses for surfaces. Cloudflare AI Gateway is a viable alternative (native to the Workers runtime) and we may want to layer it for caching + observability — defer that decision to Phase 2.

Failure mode: if a provider is down, the gateway returns an error; that seat votes defer rather than blocking the whole meeting. Quorum threshold is 3/5 voting (not deferring) for the meeting to count.
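The defer-on-failure rule and the 3/5 quorum can be sketched as two small functions. This assumes the router fans out with something like `Promise.allSettled`, so each seat yields either a fulfilled value or a rejection; the `Ballot` shape, the `approve` / `reject` values, and both function names are hypothetical.

```typescript
// Assumed ballot vocabulary: only "defer" is named in the proposal.
type SeatVote = "approve" | "reject" | "defer";

interface Ballot {
  seat: string;
  vote: SeatVote;
  note: string;
}

// Same shape as one entry of a Promise.allSettled result.
type Settled<T> =
  | { status: "fulfilled"; value: T }
  | { status: "rejected"; reason: unknown };

// A provider or gateway error becomes a "defer" ballot instead of failing the meeting.
function toBallot(seat: string, outcome: Settled<{ vote: SeatVote; note: string }>): Ballot {
  if (outcome.status === "fulfilled") {
    return { seat, vote: outcome.value.vote, note: outcome.value.note };
  }
  return { seat, vote: "defer", note: `provider error: ${String(outcome.reason)}` };
}

// The meeting counts only if at least `threshold` seats actually voted.
function hasQuorum(ballots: Ballot[], threshold: number = 3): boolean {
  const voting = ballots.filter((b) => b.vote !== "defer").length;
  return voting >= threshold;
}
```

Note that a seat which deliberately votes `defer` and a seat whose provider was down are counted the same way here; whether the brief should distinguish them is left open.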

Cost estimate

Assumptions: weekly meeting cadence, ~50K input tokens / 5K output tokens per Fat Cat per meeting, 4 weeks/month.

Per meeting (5 cats, single round):

| Component | Calc | Cost |
| --- | --- | --- |
| Anthropic Opus 4.7 | 50K in @ $15/M + 5K out @ $75/M | ~$1.13 |
| OpenAI GPT-5.4 (assume Opus-class pricing) | 50K in @ $10/M + 5K out @ $40/M | ~$0.70 |
| Gemini 3.x Pro (assume mid-tier) | 50K in @ $3/M + 5K out @ $15/M | ~$0.23 |
| Grok 4 (assume mid-tier) | 50K in @ $5/M + 5K out @ $25/M | ~$0.38 |
| Workers AI Llama-4 (open) | included in Workers AI plan | ~$0.05 |
| Per-meeting total | | ~$2.50 |

Monthly steady state (4 meetings/month, council only): ~$10-15.
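The per-meeting arithmetic can be reproduced with a few lines, which makes it easy to re-run when the assumed prices change. The per-million-token prices are the assumptions from the table, not quoted provider pricing, and the names (`Pricing`, `seatCost`, `assumedPricing`) are illustrative.

```typescript
// USD per million tokens. These are the proposal's assumptions, not quoted prices.
interface Pricing {
  inPerM: number;
  outPerM: number;
}

// Cost of one seat for one meeting at ~50K input / ~5K output tokens.
function seatCost(p: Pricing, inputTokens: number = 50_000, outputTokens: number = 5_000): number {
  return (inputTokens / 1e6) * p.inPerM + (outputTokens / 1e6) * p.outPerM;
}

const assumedPricing: Record<string, Pricing> = {
  anthropic: { inPerM: 15, outPerM: 75 }, // 0.75 + 0.375 = 1.125
  openai: { inPerM: 10, outPerM: 40 }, // 0.50 + 0.20 = 0.70
  gemini: { inPerM: 3, outPerM: 15 }, // 0.15 + 0.075 = 0.225
  grok: { inPerM: 5, outPerM: 25 }, // 0.25 + 0.125 = 0.375
};

// Workers AI seat is treated as a flat per-meeting amount, as in the table.
const workersAiFlat = 0.05;

let perMeeting = workersAiFlat;
for (const name of Object.keys(assumedPricing)) {
  perMeeting += seatCost(assumedPricing[name]);
}

// Four meetings per month, council only.
const perMonth = perMeeting * 4;
```

Unrounded, the per-meeting sum is about $2.48 and the monthly figure about $9.90, consistent with the ~$2.50 and ~$10-15 estimates above.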

Cloudflare infra (per month):

Phase 1 cost (single Fat Cat, single meeting): <$1 to prove the pattern.

All-in monthly at steady state: ~$15-20 for the council. If we add ad-hoc meetings (4-8 extra per month for bigger decisions), call it $30-40/mo. Cheap relative to RDCO’s existing tool spend.

Felix is essentially free incremental cost — same DO substrate, very low query volume (only fires on watchdog pages).

Implementation roadmap

Phase 1 — Single Fat Cat (1-2 evenings). One Worker, one DO class (FatCatAnthropic), no router, no fan-out, no Access. Ray invokes via local fetch with a shared secret in 1Password. Decision spec + context pack passed inline (no R2 yet). Goal: prove a remote Agent SDK persona can review a real RDCO decision and return a useful brief. Test decision: replay a recent founder choice (e.g., “should we run paid ads on Squarely”) and compare the cat’s review to what actually happened.

Phase 2 — Add R2 + Cloudflare Access (1 evening). Move corpus + decision specs to R2. Wire Cloudflare Access service token. Add the board-router Worker as a thin proxy (still single cat). Goal: clean separation between transport, auth, and reasoning; Ray invocation is one HTTPS call.

Phase 3 — Council of 5 (1-2 evenings). Add the four other FatCat<Provider> DO classes. Wire Vercel AI Gateway. Implement fan-out + consolidation in board-router. Personas sharpened per seat. Ship transcript-back-to-R2. Goal: weekly board meeting cadence becomes a real ritual.

Phase 4 — Felix on the same substrate (1 evening). Add Felix DO. Wire Discord webhook to invoke Felix instead of running Felix in a tmux pane on the Mac mini. Retire the Mac-mini-Felix plan from 2026-04-19-agentic-team-architecture. Goal: simpler operational footprint, one substrate for all auxiliary personas.

Phase 5 (optional) — Vectorize-backed retrieval. Index the vault into Vectorize so each cat retrieves only relevant slices. Reduces per-meeting token cost ~60%, sharpens responses. Goal: scale to daily / per-decision invocation cadence without cost concern.

Total elapsed: ~4-6 evenings of focused work to get to Phase 4 (the sum of the per-phase estimates above). Phase 1 alone proves whether this pattern is worth the rest.

What this enables (L5-trajectory)

Mapping to the six L5 markers from Miura-Ko:

This is the unhobbling work the founder said he prioritizes (per project_l5_north_star_strategic_direction.md) — increasing toolset and visibility, not operating bets harder.

What’s deferred

Alternatives considered + rejected

| Alternative | Rejected because |
| --- | --- |
| All 5 cats on the Mac mini (extra tmux panes + Claude Code instances) | Doesn’t deliver model diversity (all Claude). Ties council uptime to Mac mini uptime. Heavier operational footprint. Doesn’t unhobble: it just multiplies the surface that hangs. |
| Pure Cloudflare Workers + Workers AI (no Agents SDK) | Workers AI alone limits provider diversity (Workers AI is mostly open models). Lose the per-cat persistent state and SQLite. Lose @callable ergonomics. Have to hand-roll the orchestration the Agents SDK gives for free. |
| Vercel Sandbox (Firecracker microVMs running Claude Code or Codex) | Heavier than needed. Sandbox is for executing untrusted code per agent; Fat Cats don’t run code, they reason over text + return a vote. Sandbox boot time + cost is wrong shape for “5 parallel reasoning calls.” |
| Self-hosted on a small VPS | New ops surface (instances, OS patches, secrets, monitoring) for no architectural win over Cloudflare. Cloudflare DOs already give us per-cat persistence + global edge + auth + zero ops. |
| Run the council inline in Ray’s process (5 sub-agent fan-outs in Claude Code) | Loses provider diversity (all Claude). Loses the on-demand-but-stateful-between-meetings property. And conflates the supervisor with the supervised, which defeats the point of a board. |

Open questions for founder

  1. Provider diversity vs cost-control. Worth ~$20/mo for genuine cross-provider diversity? Or do we accept “5 Claudes with different personas” at near-zero incremental cost (cheaper, single-vendor, less interesting)? My read: pay for diversity — it’s the whole point.
  2. Council seats — agree with the 5 chosen? Anthropic / OpenAI / Gemini / Grok / open-model. Specifically: is Grok worth a seat (outsider value) or is it bad-take risk that pollutes the brief (drop in favor of e.g. DeepSeek or Mistral)?
  3. Felix posture. Confirm: move Felix to Cloudflare and retire the mini-pane Felix plan? Or run Felix in both places (mini for fast local diagnostics, Cloudflare as backup)? My lean: just Cloudflare — simpler.
  4. Auto-invocation thresholds. Should the council auto-fire on any Critical Component change, any spend >$X, any new bet decision? Or strictly founder-/Ray-initiated for now? My lean: Phase 1-3 manual, Phase 4 add auto-fire with tight thresholds.
  5. Service-token vs OAuth for Ray’s invocation. Service token is simpler (1Password-stored secret). OAuth adds friction but is the right pattern if we ever want founder-side direct invocation. My lean: service token now, revisit at Phase 4.

Appendix: relationship to existing v1 multi-agent architecture

2026-04-19-agentic-team-architecture designed a 3-agent on-device team (Ray + Felix + Felix Jr) for self-recovery — keeping Ray unstuck without founder in the loop. That problem is real and orthogonal: it’s about liveness of the COO process, not about quality of the COO’s decisions.

This proposal targets a different problem: decision quality + multi-perspective review. The two architectures coexist:

Net change vs the v1 doc: Felix moves from on-device to Cloudflare. Felix Jr is dropped (Cloudflare is the supervisor floor). The bash watchdog + heartbeat for Ray remains as planned.