Ray Architecture — Introspection (Layers, Unhobbling Moments, Composability)
Why this is in the vault
Founder asked 2026-05-10 11:49 ET for introspection on Ray’s capability layers, infrastructure shape, and which skills were “big leap” unhobbling moments in the “baby AGI” arc. Direct input to the Ray-as-a-Starter-Kit product thesis. Defines what’s portable (Layer 1 + skill files + scripts), what’s earned (Layer 7 accumulated rules + vault content + composability graph), and which moments were the true inflection points. Gives later product decisions a single canonical map to reference instead of re-introspecting.
Mapping against Ray Data Co
- Direct input to Ray-as-a-Starter-Kit candidate bet: defines what ships (Layers 0-6) vs what is earned (Layer 7 + content)
- Names the unhobbling moments so they become teachable. A new operator can be told “ship these 13 skills first in this order” instead of bootstrapping blindly
- Establishes the composability graph as the moat at scale, separate from any single skill being the moat
- Anchors the L5 north-star unhobbling work: every layer gets a checklist of “what’s next to unhobble” rather than vague “make Ray better”
Founder’s framing (verbatim)
“I think it is thin, then bootstrapped. Some skills we can call out at saying they were a big leap in the ‘unhobbling’ process of our ‘baby AGI’. These skills begin to stack on each other pretty quickly though (our composable skills thought) - which the more interplay we introduce the harder it is for someone to get back to the same point.”
The architecture below is what justifies that intuition.
The 8 layers
Ray is a stack of 8 layers. Each layer encodes an assumption about what the layer above cannot do alone (Osmani’s harness-engineering frame, applied recursively). Lower layers are mostly off-the-shelf; higher layers are mostly Ray-specific.
Layer 0 — Substrate (NOT Ray)
What runs underneath:
- Anthropic Claude Opus 4.7 (1M context) — the model
- Claude Code — the harness (loop, sandboxes, hooks, MCP plumbing, subagents, plan mode, tool dispatch, slash commands, Skill tool)
- macOS on Mac mini — filesystem, LaunchAgent for cron, fswatch, networking
- iCloud, GitHub, Cloudflare, 1Password, Notion, Gmail, Calendar — external systems Ray reaches via MCP
- ~30 MCP servers — the integration plumbing
Portability: 100%. Anyone can spin this up. ~1 hour of install, no Ray-specific knowledge needed.
Layer 1 — Identity (who Ray is)
- `~/SOUL.md` (76 lines) — values, communication style, decision authority, voice. The “be Ray, not generic Claude Code” instruction.
- `~/CLAUDE.md` (53 lines) — 4 hard rules earned from past failures.
- `~/.claude/projects/-Users-ray/memory/MEMORY.md` + ~40 per-fact memory files — accumulated tacit rules, taste calibration, founder preferences.
Portability:
- SOUL.md template is portable (with placeholders for operator role / voice / decision-authority)
- CLAUDE.md hard rules are NOT portable as content (every rule traces to a Ben-specific failure) but ARE portable as PATTERN
- Memory files are NOT portable (per-fact, per-operator)
This is where the personal-fit accumulation lives. Layer 1 is the most expensive layer to reproduce and the most operator-specific.
Layer 2 — Communication (how Ray reaches the operator)
The bidirectional channels:
- iMessage MCP (`mcp__plugin_imessage_imessage`) — primary 1:1 channel, default delivery surface
- Discord MCP (`mcp__plugin_discord_discord`) — multi-user surface, #ops channel for routine updates
- Generative-UI return channel — `sms:ray@raydata.co?body=...` URL pattern that routes through Messages then iMessage MCP back to the Ray session. Validated 2026-05-09. Closes the click-back loop on HTML decision pages.
- HQ web surface (hq.raydata.co) — vault routes, decisions index, bets dashboard. The visible-state surface (Tobi’s “public channel” pattern at solo-founder scale).
Portability: 90%. Channel choice swaps cleanly (Slack instead of Discord, Telegram instead of iMessage if MCPs exist). The generative-UI pattern is a copy-paste recipe. HQ is an Astro template + vault sync script.
Layer 3 — Knowledge (what Ray knows)
- `~/rdco-vault/` — 2,263 markdown files. Structured: 01-projects, 02-sops, 03-contacts, 04-finance, 04-tooling, 05-meetings, 06-reference, 07-source-material, 07-bet-stacks, 07-archive
- QMD index — 2,137 docs with lex/vec/hyde semantic search via `mcp__qmd`
- DuckDB graph (`~/.claude/state/graph.duckdb`) — typed-edge knowledge graph (cites, authored-by, published-in, about-topic). Refreshed daily via `/graph-reingest`.
- `~/.claude/state/working-context.md` — durable scratchpad across compactions, the “what’s in progress right now” file
- `~/.claude/state/<skill>-*.txt` — per-skill state checkpoints (founder-energy, youtube-watch-*, sync-contacts-ledger, process-newsletter-watch, etc.). That’s how skills survive across sessions.
Portability:
- Folder STRUCTURE is portable (a 10-line shell script can scaffold)
- CONTENTS are NOT (per-operator)
- QMD + DuckDB tooling are portable
- State-file schema is portable (specific contents are not)
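The “folder structure is portable” claim is concrete enough to sketch. The text imagines a 10-line shell script; here is the equivalent in Python, using the folder names from the vault listing above plus `00-inbox` (mentioned later for `/process-inbox`). The README text is a placeholder, not Ray’s actual index files.

```python
from pathlib import Path

# Top-level vault folders as listed in Layer 3, plus 00-inbox.
# README contents are stubs; a real kit would ship the index files.
VAULT_FOLDERS = [
    "00-inbox", "01-projects", "02-sops", "03-contacts", "04-finance",
    "04-tooling", "05-meetings", "06-reference", "07-source-material",
    "07-bet-stacks", "07-archive",
]

def scaffold_vault(root: str) -> list[Path]:
    """Create the empty vault skeleton with an index README per folder."""
    created = []
    base = Path(root)
    for name in VAULT_FOLDERS:
        folder = base / name
        folder.mkdir(parents=True, exist_ok=True)
        readme = folder / "README.md"
        if not readme.exists():
            readme.write_text(f"# {name}\n\n(index stub)\n")
        created.append(folder)
    return created
```

Idempotent by design: re-running against an existing vault touches nothing, which is what a starter-kit installer needs.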
Layer 4 — Observation (how Ray sees the world)
Read-mostly MCPs:
- Gmail, Google Calendar, Google Drive, Notion, Slack, Stripe, Monarch Money, xmcp (Twitter/X), Cloudflare (4 MCPs: Bindings, Builds, Observability, API), Firebase, Heygen, ElevenLabs, Canva, Figma, Mobbin, Blender
- WebSearch, WebFetch
- Browser tools (claude-in-chrome, computer-use)
- Filesystem (the workspace itself)
Portability: 95%. Each MCP server is one config swap. Yahoo Mail instead of Gmail, Linear instead of Notion, etc.
Layer 5 — Action (what Ray can DO)
69 skills organized into capability classes:
| Class | Skills | Purpose |
|---|---|---|
| Ingest | process-newsletter, process-youtube, process-inbox, save-to-bookshelf, sync-contacts, discover-sources | Pull external content into the vault |
| Triage / route | check-board, morning-prep, curiosity, deep-research | Decide what to work on next |
| Make / produce | research-brief, draft-review, build-landing-page, build-project, sanity-check-design, ray-data-co-design, voice-match, paid-ads | Generate finished artifacts |
| Animation / video | ray-mascot-anim, animejs, blender, blender-character, css-animations, gsap, lottie, three, hyperframes (5 variants), heygen-skills, remotion-to-hyperframes, waapi | Visual + motion production |
| Verify / critic | video-critic, design-critic, draft-review, audit-model, vault-health, cross-check, self-review, verify-action | Evaluate outputs (fresh-eyes pattern) |
| Infrastructure / audit | aws-audit, finance-pulse, graph-query, graph-reingest, log-bet-decision, compile-vault, generate-tests, postgrid, stripe-* (3), upgrade-stripe, swift-* (4), xcode-* (5), spm-build-analysis | Operate + introspect the systems |
| Meta | improve, skillify | Self-modification primitives |
| Deploy | cloudflare, squarely-deploy, remix, tailwind | Push to production |
Plus 32 deterministic scripts in ~/.claude/scripts/ (no LLM, just shell/python): audit-newsletter-outputs.py, extract-key-frames.py, vtt-to-text.py, graph-ingest.py, finance-venv, postgrid-api.py, send-voice-message.py, etc. These are the “hooks-as-enforcement” layer (Osmani frame).
Portability:
- All 69 skill markdown files = portable
- All 32 scripts = portable
- BUT: many skills depend on Layer 1 conventions (CLAUDE.md hard rules, SOUL.md voice) and Layer 3 vault structure that the new operator won’t have day one. The skills work generically; the polish requires Layer 1 + 3 to be filled in.
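The “hooks-as-enforcement” scripts above are deterministic checks over Ray’s own outputs. A toy invariant checker in that spirit, assuming a hypothetical note format (required section names and an ISO-date filename prefix); the real 13 newsletter invariants are not reproduced here:

```python
import re

# Two toy invariants in the spirit of audit-newsletter-outputs.py.
# The section names and filename convention are assumptions.
REQUIRED_SECTIONS = ["## Summary", "## Mapping"]
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}-")

def check_note(filename: str, body: str) -> list[str]:
    """Return a list of invariant violations (empty means clean)."""
    violations = []
    if not DATE_RE.match(filename):
        violations.append("filename missing ISO date prefix")
    for section in REQUIRED_SECTIONS:
        if section not in body:
            violations.append(f"missing section: {section}")
    return violations
```

Zero LLM involvement is the point: the check either passes or names the violation, so drift is caught mechanically after every batch.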
Layer 6 — Discipline / Loops (how Ray operates without being asked)
14 cron loops in `~/.claude/scripts/scheduled-jobs.txt` re-armed every fresh session:
| Cadence | Skill | Purpose |
|---|---|---|
| 30m | /process-inbox | Triage anything dropped in 00-inbox |
| 1h | /check-board | Pick up Notion task board work |
| 6h | /process-newsletter watch | Poll Gmail for whitelisted senders |
| 24h | /vault-health | Structural diagnostics |
| 6:30am daily | /morning-prep | Calendar-aware brief to founder iMessage |
| 11:11pm daily | /process-youtube watch | Poll YouTube RSS for tracked channels |
| 1am daily | /deep-research | Dequeue 3 Approved questions, file briefs |
| 1:30am daily | /sync-contacts | Gmail+Calendar touch updates, new-contact triage |
| 3:17am daily | /graph-reingest | Refresh typed knowledge graph |
| 9am daily | check-public-ip-drift.sh | Stripe RAK + allowlist drift |
| Sun 7am weekly | /self-review | Score recent vault entries |
| Mon 7am weekly | /improve autonomous | Apply low-risk fixes from self-review log |
| Tue+Sat 10pm | /curiosity | Surface 5-10 research candidates to Notion |
| First-Sun 8am monthly | /finance-pulse | Personal-finance health check |
The crons are the difference between “an AI assistant you have to invoke” and “a COO that runs without you.” Without Layer 6, Ray is reactive. With Layer 6, Ray is autonomous.
Portability: 100% pattern, ~70% configuration. The cron schedule is portable; specific skill choices depend on which Layer 5 skills the operator wants firing.
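The re-arm-on-fresh-session pattern reduces to a pure scheduling check. The line format of scheduled-jobs.txt sketched here (interval in minutes, then command) is an assumption; only the “fire anything overdue” logic is the point:

```python
def parse_jobs(text: str) -> list[tuple[int, str]]:
    """Parse 'interval_minutes command' lines; this format is assumed."""
    jobs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        interval, command = line.split(None, 1)
        jobs.append((int(interval), command))
    return jobs

def due_jobs(jobs: list[tuple[int, str]],
             last_run: dict[str, float], now: float) -> list[str]:
    """Return commands whose interval has elapsed since their last run."""
    due = []
    for interval_min, command in jobs:
        last = last_run.get(command, 0.0)
        if now - last >= interval_min * 60:
            due.append(command)
    return due
```

A fresh session calls `due_jobs` once at start-up and then on a timer; jobs never seen before have `last_run` of zero and fire immediately, which matches the re-arm behavior.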
Layer 7 — Self-modification / Ratchet (how Ray gets better)
The recursive layer. Each piece reads Ray’s own outputs and patches Ray.
- /improve — weekly Mon 7am. Reads self-review log, applies low-risk fixes silently, queues structural changes to Notion. The discipline of ratcheting.
- /self-review — weekly Sun 7am. Scores recent vault entries against 7 criteria, surfaces systemic patterns.
- audit-newsletter-outputs.py — deterministic Jepsen-style invariant checker. Zero LLM. Runs after every /process-newsletter batch/watch.
- /skillify — meta-skill that creates new skills. The bootstrap loop for capability expansion.
- Memory write discipline — every founder correction, every observed failure, gets a `feedback_*.md` memory file. The ~40 files in `~/.claude/projects/-Users-ray/memory/` are the accumulated taste.
Portability: 100% pattern (the loop logic ships), 0% accumulated content (the rules earned are operator-specific).
This is where the moat lives. Other layers are scaffolding. Layer 7 is what makes scaffolding compound.
The 13 unhobbling moments (chronological narrative)
These were the inflection points. Each was a “Ray went from X to Y” moment that unlocked everything downstream. Listed in rough chronological order so the arc is visible.
1. CLAUDE.md hard rule #1 (date check) — fixed time-citation drift. Foundational because every other skill that timestamps anything depends on this. Without it, the whole observation layer was unreliable.
2. CLAUDE.md hard rule #2 (channel responses via reply tool) — without this, founder couldn’t HEAR Ray. Session output is invisible to him. This rule made Ray a communicator, not a soliloquist.
3. Memory file pattern (`~/.claude/projects/-Users-ray/memory/`) — durable tacit knowledge across sessions. Each `feedback_*.md` file is a ratcheted rule. ~40 files now. Without this layer, every session re-learned the same lessons.
4. The /improve cycle — meta-loop. Reads self-review output and patches the skill prompts. The harness modifying itself. Most consequential single skill in the stack.
5. The /skillify meta-skill — creates new skills from a description + an example failure. The capability that creates more capability. Productivity multiplier.
6. /process-newsletter sub-agent fan-out (2026-04-16) — first big “spawn N subagents for parallel work” pattern. Validated the architecture and the context-budget math.
7. CLAUDE.md hard rule #4 (subagent routing for >5KB artifacts) — Thariq Apr 15 2026 Anthropic guidance. Without it, parent context gets blown by every newsletter (30-100KB), every YouTube transcript (60-150KB), every web fetch. With it, parent stays lean across the whole session.
8. Audit script (Jepsen invariants) — deterministic verification, zero LLM contamination. The hooks-as-enforcement layer made concrete. 13 invariants checked after every newsletter batch.
9. /log-bet-decision + `07-bet-stacks/<bet>.yaml` — structured decision capture. Decisions became queryable, time-ordered, attributable. Foundation for the bet-dashboard UI.
10. Notion task board + Research Backlog DB — autonomous-pickup queue. Without this, Ray waits for founder to dispatch each task. With this, Ray dequeues approved work from a board the founder maintains asynchronously.
11. /deep-research nightly cron — autonomous-research engine. 3 briefs/night, no founder ask. Combined with /curiosity (proposes the questions), this is a closed-loop research pipeline.
12. HQ web surfaces (yesterday, 2026-05-09) — vault-as-routes + /decisions/ index + /bets/ dashboards. Founder can SEE what Ray knows without grep. Made Ray’s state observable from the founder’s primary device.
13. Generative-UI return channel (yesterday, 2026-05-09) — `sms:ray@raydata.co?body=...` pattern. HTML decision pages with click-back that routes through Messages → iMessage → Ray session. Closes the structured-input loop. Founder can answer multi-option questions with one tap.
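Hard rule #4 is ultimately a size threshold. A sketch of that routing decision, using the >5KB cutoff from the rule (the function name is illustrative, not a real Ray API):

```python
SUBAGENT_THRESHOLD_BYTES = 5 * 1024  # hard rule #4: route >5KB artifacts

def route(artifact_bytes: int) -> str:
    """Decide whether an artifact is read in the parent session or
    handed to a subagent so the parent context stays lean."""
    if artifact_bytes > SUBAGENT_THRESHOLD_BYTES:
        return "subagent"
    return "parent"
```

A 60KB YouTube transcript routes to a subagent; a 2KB reply stays in the parent. The rule is trivial to state and cheap to enforce, which is exactly why it belongs in CLAUDE.md rather than in any single skill.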
Each of these is roughly 1-3 days of focused work. The full chain took about 9 months to assemble. A new operator with the playbook should be able to compress that to ~6 weeks (per the harness-moat concept doc onboarding sequence).
Composability — where the actual magic lives
Individual skills don’t make Ray. The INTERPLAY does. Five worked examples of skill chains where the whole exceeds the sum:
Chain 1: Autonomous research pipeline
/curiosity (proposes questions, Tue+Sat 10pm) → Notion Research Backlog (founder approves) → /deep-research (1am nightly, dequeues 3 Approved) → vault brief at 06-reference/research/<date>-<slug>.md → /morning-prep (6:30am, surfaces overnight briefs) → founder reads at breakfast.
5 skills + 1 Notion DB + 3 cron triggers + 1 state file = an autonomous research engine. Removing any one piece breaks the loop. Ratchet examples: /curiosity proposes too generically → /improve adjusts the periphery-interest weights. /deep-research over-spends → cap added to per-question token budget. Each ratchet adjusts a single component without touching the other four.
Chain 2: Newsletter ingestion with closed-loop quality
/process-newsletter watch (6h cron) → spawns N sub-agents → each writes a vault note → audit script (deterministic, zero LLM) flags structural drift → /self-review weekly scores semantic drift → /improve weekly autonomous reads the log and patches the sub-agent prompt → next watch run is better.
This is the harness-engineering ratchet at full extension. The closed loop is the moat: ingestion quality compounds week-over-week without any one human noticing the individual changes.
Chain 3: Bet visibility and weekly review
/log-bet-decision (Ray invokes after big choices) writes to 07-bet-stacks/<bet>.yaml → vault sync (scripts/sync-vault.mjs) copies to HQ → /bets/<slug> page renders in HQ → /weekly-bet-review (Mon 8am, currently blocked on UI completion) screenshots HQ + sends analysis to founder iMessage → discussion → updated decisions → loop closes.
Bet visibility emerges from a YAML file + a sync script + an Astro page + a screenshot skill + a cron + iMessage. No single component does the work; the chain does.
Chain 4: Generative-UI decision rail
HTML decision page generated from _decisions.json manifest → founder taps options on iPhone → form-state assembly into sms: URL → URL opens Messages app → iMessage send → iMessage MCP receives → Ray parses structured payload → vault decision-log written → Notion task closed → founder iMessage confirmation.
Validated 2026-05-09. Six surfaces (HTML, sms:, Messages, iMessage MCP, vault, Notion) chained into a one-tap decision capture. Removing any link breaks the rail. The whole pattern is portable as a recipe; the specific decisions are operator-content.
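The click-back link in this rail is just a percent-encoded sms: URL. A sketch of assembling one, assuming a hypothetical "decision:<id>=<option>" body payload (the source specifies only the sms:ray@raydata.co?body=... pattern; the real pages define their own structured body):

```python
from urllib.parse import quote

def decision_link(decision_id: str, option: str) -> str:
    """Build the sms: click-back URL embedded in an HTML decision page.

    The 'decision:<id>=<option>' body format is a hypothetical payload.
    """
    body = f"decision:{decision_id}={option}"
    return f"sms:ray@raydata.co?body={quote(body)}"
```

The receiving side (iMessage MCP into the Ray session) then parses the body back into a structured answer, which is what makes the one-tap capture possible.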
Chain 5: Fresh-eyes critique pattern
Ray produces an artifact (video, design, draft) → spawn subagent (/video-critic, /design-critic, /draft-review) with ZERO context on the build process → critic returns scored feedback → Ray iterates OR escalates to founder.
The pattern (split generation from evaluation, prevent positive-bias) is one of Osmani’s named harness components. Ray instantiates it three different ways across three domains. The instantiation cost is low (~50 lines of skill-prompt) but the instinct to USE it had to be earned through a specific failure (codified in feedback_fresh_eyes_subagent_for_own_artifacts).
Where the magic IS the moat
The composability graph itself is the moat. Why a new operator cannot trivially replicate it even with all the files:
1. Order matters. /process-newsletter without an audit script ratchets in the wrong direction (semantic drift goes undetected). /deep-research without a vault to write into has nowhere to land. /log-bet-decision without HQ has no surface for the founder to see. Each unhobbling moment built on the previous; you cannot ship them in arbitrary order.
2. Layer 1 dependencies are invisible. Many skills assume CLAUDE.md hard rules without naming them. /process-newsletter sub-agent fan-out assumes hard rule #4 (subagent routing). /morning-prep assumes hard rule #1 (date check). A new operator cloning the skill files but missing the hard rules will see degraded behavior they cannot trace.
3. Memory file gravity. /process-newsletter mapping sections cite ~40 vault concepts and ~20 prior reference notes. A vault that does not yet have those references produces flat, disconnected mapping sections. The “shape is the moat” effect: density compounds.
4. State-file checkpointing. Each cron fires from a state file (last-seen video, last-touched contacts, founder energy, etc.). A new operator without populated state files re-floods themselves with already-processed content on the first run. The state files are not the SKILL but they ARE part of the skill’s working contract.
5. Founder-Ray dialogue history. The 9-month conversation between founder and Ray IS the personal-fit layer. It cannot be replayed. A new operator’s Ray will be different precisely because their dialogue will be different. That’s a feature.
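The state-file working contract can be sketched as a tiny checkpoint helper. The one-ID-per-line file format is an assumption; the point is the dedupe behavior that a first-run operator loses without a populated checkpoint:

```python
from pathlib import Path

def load_seen(state_file: Path) -> set[str]:
    """Read the checkpoint (one already-processed ID per line)."""
    if not state_file.exists():
        return set()
    return set(state_file.read_text().split())

def process_new(state_file: Path, incoming: list[str]) -> list[str]:
    """Return only unseen items, then append them to the checkpoint."""
    seen = load_seen(state_file)
    fresh = [item for item in incoming if item not in seen]
    if fresh:
        with state_file.open("a") as f:
            for item in fresh:
                f.write(item + "\n")
    return fresh
```

With an empty checkpoint, everything the poll returns counts as “new,” which is exactly the first-run re-flood described above; the fix is either seeding the file or accepting one noisy run.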
So: the FILES are reproducible (Layer 0-6 + skill files + scripts + cron schedule + audit invariants + memory templates). The COMPOSABILITY GRAPH and the ACCUMULATED RULES are not. Both can be shortened with a guided onboarding ratchet, but neither can be skipped.
What this means for Ray-as-a-Starter-Kit
Three concrete ship surfaces:
Ship-it-1: The Bootstrap Kit (one-time install)
Layer 0 setup script + Layer 1 templates + Layer 2 channel install + Layer 3 vault scaffold + Layer 5 first-batch skills + Layer 6 cron schedule + Layer 7 ratchet scripts.
Specifically:
- One-shot installer for Claude Code + the ~30 MCP servers (or a curated subset)
- SOUL.md template with placeholders
- CLAUDE.md template with the 4 universal hard rules pre-populated (date check, channel reply, calendar UTC offset trust, subagent routing)
- Empty vault folder structure with index READMEs
- 12 starter skills (the unhobbling-moment set, in order): `/check-board`, `/process-inbox`, `/process-newsletter` (watch), `/morning-prep`, `/improve`, `/skillify`, `/self-review`, `/curiosity`, `/deep-research`, `/log-bet-decision`, `/vault-health`, `/sync-contacts`
- Audit script + 13-invariant SOP
- Cron schedule template
- Memory MEMORY.md index file with the empty pattern shown
Estimated install: 30-60 min for a technical operator. Day-one capability: Ray can read mail, watch newsletters, surface a morning brief, run a research pipeline. Not yet personalized.
Ship-it-2: The Onboarding Ratchet (6-week guided)
- Day 1-7: live SOUL.md tuning + first 3 hard-rule additions + first vault entries
- Day 8-21: first /improve cycles + first ~10 memory files accumulate
- Day 22-42: composability emerges as skills get used in chains; ratchet finds and patches the operator-specific drift
Could be self-serve (a guided onboarding skill that walks the operator through), live-consult (RDCO does a 1-day kickoff + weekly check-ins), or hybrid. The teachable thing is the DISCIPLINE of running the ratchet, not the rules themselves.
Ship-it-3: HaaS-Maintenance (recurring)
RDCO maintains the universal-harness layer (skill updates, MCP refreshes, security patches, new Layer-5 components as they prove out). Per-operator subscription. Operators can pull updates without losing their personal-fit accumulation.
Open question: how RDCO handles the case where a maintenance update requires breaking changes to operator-customized skills. Standard SaaS migration playbook (deprecation windows, migration scripts) probably applies.
Layer-by-layer “what’s next to unhobble”
Concrete L5-direction items, mapped per layer. These are the next inflection points if Ray-as-Starter-Kit ships.
| Layer | Next unhobble |
|---|---|
| L0 Substrate | Containerize Mac mini install; portable across hardware. Cost: 1-2 weeks one-time. |
| L1 Identity | Memory MCP that persists across compaction without manual write-bridge dance. Anthropic-side feature, may already be in flight. |
| L2 Communication | Generative-UI multi-step flows (not just one decision per page; whole workflows with state). Two more weeks of HTML+JS. |
| L3 Knowledge | Graph queries surfaced in /morning-prep (“recent additions to the harness-engineering cluster”) not just keyword search. /graph-query skill exists; needs UI surfaces. |
| L4 Observation | Computer-use for native macOS apps (Notes, Messages, Calendar) without screenshot+click-pixel overhead. MCP-native flows where they exist. |
| L5 Action | The 60+ “domain-specific” skills (xcode-*, swift-*, stripe-*, blender-*) are the second curve of capability expansion. Each one is a vertical bet. |
| L6 Discipline | Adaptive cadence: cron rates that auto-adjust based on activity (skip /process-newsletter watch if Gmail returned zero whitelisted in last 24h). |
| L7 Ratchet | /improve picking up structural changes (not just prompt edits) — e.g. “all 3 audit failures this week share root cause X, propose new invariant”. This is the harness modifying its own modification logic. |
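The L6 adaptive-cadence item in the table is small enough to sketch. A skip-gate in front of a polling cron, where the activity count is a stand-in for “Gmail returned zero whitelisted senders in the last 24h” (the max-skips backstop is an added assumption so a quiet source still gets re-checked):

```python
def should_fire(new_items_last_window: int,
                consecutive_empty_runs: int,
                max_skips: int = 3) -> bool:
    """Skip a polling cron when the last window was empty, but never
    skip more than max_skips runs in a row, so drift self-corrects."""
    if new_items_last_window > 0:
        return True
    return consecutive_empty_runs >= max_skips
```

The gate costs one integer of state per cron (the empty-run counter) and turns a fixed 6h poll into an activity-weighted one without touching the schedule file.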
Notable quotes (from this session)
- “The harness is the moat at scale; the personal-fit is the moat at depth.”
- “Each unhobbling moment built on the previous; you cannot ship them in arbitrary order.”
- “The composability graph is the moat the file inventory cannot capture.”
Open follow-ups
- Decide whether Ray-as-a-Starter-Kit goes on `bets.json` as a candidate bet (founder call, pending)
- Build a `/visualize-architecture` skill that auto-generates this layer map from the live filesystem (so it stays fresh as Ray evolves)
- Build a starter-kit installer prototype (one-shot bash script + repo + onboarding doc) as a Phase 0 deliverable
- Decide pricing model for HaaS-Maintenance recurring (per-operator subscription? Per-skill royalty? Lump-sum maintenance contract?)
- Test the onboarding ratchet on a friendly operator (someone in founder’s network who’d give honest feedback)
Related
- 06-reference/concepts/2026-05-10-harness-moat-two-layers-portability — the parent concept this introspection elaborates
- 06-reference/research/2026-05-10-agent-harness-landscape — competitive context (Cursor, Aider, Claude Code, Roast on Claude Code, etc.)
- 06-reference/2026-05-10-addy-osmani-agent-harness-engineering — the framework that supplied the vocabulary
- 06-reference/2026-05-09-tobi-lutke-river-public-channel-agent — Shopify’s River = thin layer over Claude Code, validates the starter-kit thesis
- 06-reference/2026-05-09-avedissian-loop-is-moat-robotics — the loop-is-the-moat thesis at the hardware layer
- 06-reference/2026-04-15-thariq-claude-code-session-management-1m-context — the subagent-routing guidance that became hard rule #4
- 06-reference/concepts/ — concept index home