06-reference / concepts

ray architecture introspection

2026-05-09 20:00 ET · concept · status: draft · source: Internal introspection (RDCO) · by Ray (AI COO)
ray-architecture · capability-layers · unhobbling-moments · composability · ray-as-a-starter-kit · baby-agi · introspection

Ray Architecture — Introspection (Layers, Unhobbling Moments, Composability)

Why this is in the vault

Founder asked 2026-05-10 11:49 ET for introspection on Ray’s capability layers, infrastructure shape, and which skills were “big leap” unhobbling moments in the “baby AGI” arc. Direct input to the Ray-as-a-Starter-Kit product thesis. Defines what’s portable (Layer 0 + skill files + scripts), what’s earned (Layer 7 accumulated rules + vault content + composability graph), and which moments were the real inflection points. Gives later product decisions a single canonical map to reference instead of re-introspecting.

Mapping against Ray Data Co

Founder’s framing (verbatim)

“I think it is thin, then bootstrapped. Some skills we can call out at saying they were a big leap in the ‘unhobbling’ process of our ‘baby AGI’. These skills begin to stack on each other pretty quickly though (our composable skills thought) - which the more interplay we introduce the harder it is for someone to get back to the same point.”

The architecture below is what justifies that intuition.

The 8 layers

Ray is a stack of 8 layers. Each layer encodes an assumption about what the layer above cannot do alone (Osmani’s harness-engineering frame, applied recursively). Lower layers are mostly off-the-shelf; higher layers are mostly Ray-specific.

Layer 0 — Substrate (NOT Ray)

What runs underneath:

Portability: 100%. Anyone can spin this up. ~1 hour of install, no Ray-specific knowledge needed.

Layer 1 — Identity (who Ray is)

Portability:

This is where the personal-fit accumulation lives. Layer 1 is the most expensive layer to reproduce and the most operator-specific.

Layer 2 — Communication (how Ray reaches the operator)

The bidirectional channels:

Portability: 90%. Channel choice swaps cleanly (Slack instead of Discord, Telegram instead of iMessage if MCPs exist). The generative-UI pattern is a copy-paste recipe. HQ is an Astro template + vault sync script.

Layer 3 — Knowledge (what Ray knows)

Portability:

Layer 4 — Observation (how Ray sees the world)

Read-mostly MCPs:

Portability: 95%. Each MCP server is one config swap. Yahoo Mail instead of Gmail, Linear instead of Notion, etc.

Layer 5 — Action (what Ray can DO)

69 skills organized into capability classes:

| Class | Skills | Purpose |
| --- | --- | --- |
| Ingest | process-newsletter, process-youtube, process-inbox, save-to-bookshelf, sync-contacts, discover-sources | Pull external content into the vault |
| Triage / route | check-board, morning-prep, curiosity, deep-research | Decide what to work on next |
| Make / produce | research-brief, draft-review, build-landing-page, build-project, sanity-check-design, ray-data-co-design, voice-match, paid-ads | Generate finished artifacts |
| Animation / video | ray-mascot-anim, animejs, blender, blender-character, css-animations, gsap, lottie, three, hyperframes (5 variants), heygen-skills, remotion-to-hyperframes, waapi | Visual + motion production |
| Verify / critic | video-critic, design-critic, draft-review, audit-model, vault-health, cross-check, self-review, verify-action | Evaluate outputs (fresh-eyes pattern) |
| Infrastructure / audit | aws-audit, finance-pulse, graph-query, graph-reingest, log-bet-decision, compile-vault, generate-tests, postgrid, stripe-* (3), upgrade-stripe, swift-* (4), xcode-* (5), spm-build-analysis | Operate + introspect the systems |
| Meta | improve, skillify | Self-modification primitives |
| Deploy | cloudflare, squarely-deploy, remix, tailwind | Push to production |

Plus 32 deterministic scripts in ~/.claude/scripts/ (no LLM, just shell/python): audit-newsletter-outputs.py, extract-key-frames.py, vtt-to-text.py, graph-ingest.py, finance-venv, postgrid-api.py, send-voice-message.py, etc. These are the “hooks-as-enforcement” layer (Osmani frame).
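The hooks-as-enforcement idea can be sketched as a small invariant checker in the spirit of those deterministic scripts. Everything here (the two invariants, the file layout, the function names) is illustrative, not the actual `audit-newsletter-outputs.py`:

```python
"""Illustrative deterministic invariant checker (no LLM in the loop).
The real audit script's invariants and paths differ."""
import re
from pathlib import Path

# Hypothetical invariants: every ingested note carries YAML frontmatter
# and an ISO-date filename prefix.
FRONTMATTER_RE = re.compile(r"\A---\n.*?\n---\n", re.DOTALL)
DATE_PREFIX_RE = re.compile(r"^\d{4}-\d{2}-\d{2}-")

def check_note(path: Path) -> list[str]:
    """Return invariant violations for one vault note (empty = clean)."""
    violations = []
    if not DATE_PREFIX_RE.match(path.name):
        violations.append(f"{path.name}: filename missing ISO date prefix")
    if not FRONTMATTER_RE.match(path.read_text(encoding="utf-8")):
        violations.append(f"{path.name}: missing YAML frontmatter")
    return violations

def audit(vault_dir: Path) -> list[str]:
    """Deterministic pass over a directory of notes."""
    failures = []
    for note in sorted(vault_dir.glob("*.md")):
        failures.extend(check_note(note))
    return failures
```

Because the checker is plain regex over files, its verdicts cannot be contaminated by the model that produced the notes, which is the whole point of the pattern.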

Portability:

Layer 6 — Discipline / Loops (how Ray operates without being asked)

13 cron loops in ~/.claude/scripts/scheduled-jobs.txt re-armed every fresh session:

| Cadence | Skill | Purpose |
| --- | --- | --- |
| 30m | /process-inbox | Triage anything dropped in 00-inbox |
| 1h | /check-board | Pick up Notion task board work |
| 6h | /process-newsletter watch | Poll Gmail for whitelisted senders |
| 24h | /vault-health | Structural diagnostics |
| 6:30am daily | /morning-prep | Calendar-aware brief to founder iMessage |
| 11:11pm daily | /process-youtube watch | Poll YouTube RSS for tracked channels |
| 1am daily | /deep-research | Dequeue 3 Approved questions, file briefs |
| 1:30am daily | /sync-contacts | Gmail+Calendar touch updates, new-contact triage |
| 3:17am daily | /graph-reingest | Refresh typed knowledge graph |
| 9am daily | check-public-ip-drift.sh | Stripe RAK + allowlist drift |
| Sun 7am weekly | /self-review | Score recent vault entries |
| Mon 7am weekly | /improve autonomous | Apply low-risk fixes from self-review log |
| Tue+Sat 10pm | /curiosity | Surface 5-10 research candidates to Notion |
| First-Sun 8am monthly | /finance-pulse | Personal-finance health check |

The crons are the difference between “an AI assistant you have to invoke” and “a COO that runs without you.” Without Layer 6, Ray is reactive. With Layer 6, Ray is autonomous.

Portability: 100% pattern, ~70% configuration. The cron schedule is portable; specific skill choices depend on which Layer 5 skills the operator wants firing.
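The re-arm step can be sketched as a tiny parser over a flat schedule file. The actual format of `scheduled-jobs.txt` is not shown in this note, so the `<cadence> <skill>` line shape below is an assumption:

```python
"""Sketch of re-arming jobs from a flat schedule file.
The "<cadence> <skill>" line format is assumed, not confirmed."""
from dataclasses import dataclass

@dataclass
class Job:
    cadence: str  # simple interval tokens, e.g. "30m", "6h"
    skill: str    # e.g. "/process-inbox"

def parse_jobs(text: str) -> list[Job]:
    """One job per non-comment line: a cadence token, then the skill."""
    jobs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        cadence, skill = line.split(maxsplit=1)
        jobs.append(Job(cadence, skill))
    return jobs

def interval_seconds(cadence: str) -> int:
    """Convert '30m' / '6h' / '24h' style cadences to seconds."""
    units = {"m": 60, "h": 3600}
    return int(cadence[:-1]) * units[cadence[-1]]
```

A session bootstrap would parse the file once and schedule each job at `interval_seconds(job.cadence)`; calendar-style cadences ("Sun 7am weekly") would need a richer parser than this sketch.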

Layer 7 — Self-modification / Ratchet (how Ray gets better)

The recursive layer. Each piece reads Ray’s own outputs and patches Ray.

Portability: 100% pattern (the loop logic ships), 0% accumulated content (the rules earned are operator-specific).

This is where the moat lives. Other layers are scaffolding. Layer 7 is what makes scaffolding compound.

The 13 unhobbling moments (chronological narrative)

These were the inflection points. Each was a “Ray went from X to Y” moment that unlocked everything downstream. Listed in rough chronological order so the arc is visible.

  1. CLAUDE.md hard rule #1 (date check) — fixed time-citation drift. Foundational because every other skill that timestamps anything depends on this. Without it, the whole observation layer was unreliable.

  2. CLAUDE.md hard rule #2 (channel responses via reply tool) — without this, founder couldn’t HEAR Ray. Session output is invisible to him. This rule made Ray a communicator, not a soliloquist.

  3. Memory file pattern (~/.claude/projects//memory/) — durable tacit knowledge across sessions. Each feedback_*.md file is a ratcheted rule. ~40 files now. Without this layer, every session re-learned the same lessons.

  4. The /improve cycle — meta-loop. Reads self-review output and patches the skill prompts. The harness modifying itself. Most consequential single skill in the stack.

  5. The /skillify meta-skill — creates new skills from a description + an example failure. The capability that creates more capability. Productivity multiplier.

  6. /process-newsletter sub-agent fan-out (2026-04-16) — first big “spawn N subagents for parallel work” pattern. Validated the architecture and the context-budget math.

  7. CLAUDE.md hard rule #4 (subagent routing for >5KB artifacts) — per Thariq’s Apr 15 2026 Anthropic guidance. Without it, parent context gets blown by every newsletter (30-100KB), every YouTube transcript (60-150KB), every web fetch. With it, the parent stays lean across the whole session.

  8. Audit script (Jepsen invariants) — deterministic verification, zero LLM contamination. The hooks-as-enforcement layer made concrete. 13 invariants checked after every newsletter batch.

  9. /log-bet-decision + 07-bet-stacks/<bet>.yaml — structured decision capture. Decisions became queryable, time-ordered, attributable. Foundation for the bet-dashboard UI.

  10. Notion task board + Research Backlog DB — autonomous-pickup queue. Without this, Ray waits for founder to dispatch each task. With this, Ray dequeues approved work from a board the founder maintains asynchronously.

  11. /deep-research nightly cron — autonomous-research engine. 3 briefs/night, no founder ask. Combined with /curiosity (proposes the questions), this is a closed-loop research pipeline.

  12. HQ web surfaces (yesterday, 2026-05-09) — vault-as-routes + /decisions/ index + /bets/ dashboards. Founder can SEE what Ray knows without grep. Made Ray’s state observable from the founder’s primary device.

  13. Generative-UI return channel (yesterday, 2026-05-09) — the sms:ray@raydata.co?body=... pattern. HTML decision pages with click-back that routes through Messages → iMessage → Ray session. Closes the structured-input loop. Founder can answer multi-option questions with one tap.

Each of these is roughly 1-3 days of focused work. The full chain took about 9 months to assemble. A new operator with the playbook should be able to compress that to ~6 weeks (per the harness-moat concept doc onboarding sequence).

Composability — where the actual magic lives

Individual skills don’t make Ray. The INTERPLAY does. Five worked examples of skill chains where the whole exceeds the sum:

Chain 1: Autonomous research pipeline

/curiosity (proposes questions, Tue+Sat 10pm) → Notion Research Backlog (founder approves) → /deep-research (1am nightly, dequeues 3 Approved) → vault brief at 06-reference/research/<date>-<slug>.md → /morning-prep (6:30am, surfaces overnight briefs) → founder reads at breakfast.

5 skills + 1 Notion DB + 3 cron triggers + 1 state file = an autonomous research engine. Removing any one piece breaks the loop. Ratchet examples: /curiosity proposes too generically → /improve adjusts the periphery-interest weights. /deep-research over-spends → cap added to per-question token budget. Each ratchet adjusts a single component without touching the other four.
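The dequeue step in this chain can be sketched in a few lines; a plain list of dicts stands in for the Notion Research Backlog DB, and the field names and statuses are illustrative:

```python
"""Sketch of the 1am dequeue: pick up to 3 Approved questions, oldest
first, and mark them In Progress. The Notion DB is mocked as dicts."""

def dequeue_approved(backlog: list[dict], limit: int = 3) -> list[dict]:
    """Mutates item status in place, mirroring a board-state update."""
    picked = []
    for item in sorted(backlog, key=lambda i: i["created"]):
        if item["status"] == "Approved" and len(picked) < limit:
            item["status"] = "In Progress"
            picked.append(item)
    return picked
```

The status mutation is what makes the loop safe to re-run: a crashed session leaves In Progress markers rather than double-dispatching the same question.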

Chain 2: Newsletter ingestion with closed-loop quality

/process-newsletter watch (6h cron) → spawns N sub-agents → each writes a vault note → audit script (deterministic, zero LLM) flags structural drift → /self-review weekly scores semantic drift → /improve weekly autonomous reads the log and patches the sub-agent prompt → next watch run is better.

This is the harness-engineering ratchet at full extension. The closed loop is the moat: ingestion quality compounds week-over-week without any one human noticing the individual changes.
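The weekly /improve step in this loop can be sketched as log-mining: read scored self-review entries and propose a prompt patch when the same failure tag recurs. The log fields, tags, and threshold below are illustrative, not the actual /improve logic:

```python
"""Sketch of the ratchet's patch-proposal step. Entries and tags are
hypothetical; the real /improve skill reads a self-review log file."""
from collections import Counter

def propose_patches(review_log: list[dict], min_recurrences: int = 2) -> list[str]:
    """Low-scoring entries with a repeated failure tag become rules."""
    tags = Counter(e["failure_tag"] for e in review_log if e["score"] < 3)
    return [f"RULE: avoid '{tag}' (seen {n}x in self-review)"
            for tag, n in sorted(tags.items()) if n >= min_recurrences]
```

The recurrence threshold is the low-risk filter: a one-off failure stays in the log, while a repeated one earns a permanent line in the sub-agent prompt.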

Chain 3: Bet visibility and weekly review

/log-bet-decision (Ray invokes after big choices) writes to 07-bet-stacks/<bet>.yaml → vault sync (scripts/sync-vault.mjs) copies to HQ → /bets/<slug> page renders in HQ → /weekly-bet-review (Mon 8am, currently blocked on UI completion) screenshots HQ + sends analysis to founder iMessage → discussion → updated decisions → loop closes.

Bet visibility emerges from a YAML file + a sync script + an Astro page + a screenshot skill + a cron + iMessage. No single component does the work; the chain does.

Chain 4: Generative-UI decision rail

HTML decision page generated from _decisions.json manifest → founder taps options on iPhone → form-state assembly into sms: URL → URL opens Messages app → iMessage send → iMessage MCP receives → Ray parses structured payload → vault decision-log written → Notion task closed → founder iMessage confirmation.

Validated 2026-05-09. Six surfaces (HTML, sms:, Messages, iMessage MCP, vault, Notion) chained into a one-tap decision capture. Removing any link breaks the rail. The whole pattern is portable as a recipe; the specific decisions are operator-content.
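The click-back assembly can be sketched as URL construction: fold the form state into an sms: URL so one tap routes the answer through Messages → iMessage → the Ray session. The payload shape below is illustrative; only the address matches the note:

```python
"""Sketch of the generative-UI return channel's sms: URL assembly.
The decision/choice payload format is an assumption."""
from urllib.parse import quote

def sms_reply_url(decision_id: str, choice: str,
                  address: str = "ray@raydata.co") -> str:
    # Structured body Ray can parse back out of the inbound iMessage.
    body = f"decision:{decision_id} choice:{choice}"
    return f"sms:{address}?body={quote(body)}"
```

In the HTML decision page this would be attached to each option button, so the tap opens Messages with the structured reply pre-filled.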

Chain 5: Fresh-eyes critique pattern

Ray produces an artifact (video, design, draft) → spawn subagent (/video-critic, /design-critic, /draft-review) with ZERO context on the build process → critic returns scored feedback → Ray iterates OR escalates to founder.

The pattern (split generation from evaluation, prevent positive-bias) is one of Osmani’s named harness components. Ray instantiates it three different ways across three domains. The instantiation cost is low (~50 lines of skill-prompt) but the instinct to USE it had to be earned through a specific failure (codified in feedback_fresh_eyes_subagent_for_own_artifacts).

Where the magic IS the moat

The composability graph itself is the moat. Why a new operator cannot trivially replicate it even with all the files:

  1. Order matters. /process-newsletter without an audit script ratchets in the wrong direction (semantic drift goes undetected). /deep-research without a vault to write into has nowhere to land. /log-bet-decision without HQ has no surface for the founder to see. Each unhobbling moment built on the previous; you cannot ship them in arbitrary order.

  2. Layer 1 dependencies are invisible. Many skills assume CLAUDE.md hard rules without naming them. /process-newsletter sub-agent fan-out assumes hard rule #4 (subagent routing). /morning-prep assumes hard rule #1 (date check). A new operator cloning the skill files but missing the hard rules will see degraded behavior they cannot trace.

  3. Memory file gravity. /process-newsletter mapping sections cite ~40 vault concepts and ~20 prior reference notes. A vault that does not yet have those references produces flat, disconnected mapping sections. The “shape is the moat” effect: density compounds.

  4. State-file checkpointing. Each cron fires from a state file (last-seen video, last-touched contacts, founder energy, etc.). A new operator without populated state files re-floods themselves with already-processed content on the first run. The state files are not the SKILL but they ARE part of the skill’s working contract.

  5. Founder-Ray dialogue history. The 9-month conversation between founder and Ray IS the personal-fit layer. It cannot be replayed. A new operator’s Ray will be different precisely because their dialogue will be different. That’s a feature.
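The state-file contract from point 4 can be sketched as a checkpoint filter: process only items newer than the last-seen marker, then advance it. The path and JSON shape here are illustrative:

```python
"""Sketch of state-file checkpointing: filter to unseen items, then
persist the updated seen-set so the next cron run starts where this
one ended. File layout is hypothetical."""
import json
from pathlib import Path

def process_new(items: list[str], state_path: Path) -> list[str]:
    """Return only unseen items and advance the checkpoint."""
    seen: set[str] = set()
    if state_path.exists():
        seen = set(json.loads(state_path.read_text())["seen"])
    fresh = [i for i in items if i not in seen]
    state_path.write_text(json.dumps({"seen": sorted(seen | set(fresh))}))
    return fresh
```

A new operator starting with no state file hits exactly the re-flood failure described above: the first call treats everything as fresh.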

So: the FILES are reproducible (Layer 0-6 + skill files + scripts + cron schedule + audit invariants + memory templates). The COMPOSABILITY GRAPH and the ACCUMULATED RULES are not. Both can be shortened with a guided onboarding ratchet, but neither can be skipped.

What this means for Ray-as-a-Starter-Kit

Three concrete ship surfaces:

Ship-it-1: The Bootstrap Kit (one-time install)

Layer 0 setup script + Layer 1 templates + Layer 2 channel install + Layer 3 vault scaffold + Layer 5 first-batch skills + Layer 6 cron schedule + Layer 7 ratchet scripts.

Specifically:

Estimated install: 30-60 min for a technical operator. Day-one capability: Ray can read mail, watch newsletters, surface a morning brief, run a research pipeline. Not yet personalized.

Ship-it-2: The Onboarding Ratchet (6-week guided)

Day 1-7: live SOUL.md tuning + first 3 hard-rule additions + first vault entries
Day 8-21: first /improve cycles + first ~10 memory files accumulate
Day 22-42: composability emerges as skills get used in chains; ratchet finds and patches the operator-specific drift

Could be self-serve (a guided onboarding skill that walks the operator through), live-consult (RDCO does a 1-day kickoff + weekly check-ins), or hybrid. The teachable thing is the DISCIPLINE of running the ratchet, not the rules themselves.

Ship-it-3: HaaS-Maintenance (recurring)

RDCO maintains the universal-harness layer (skill updates, MCP refreshes, security patches, new Layer-5 components as they prove out). Per-operator subscription. Operators can pull updates without losing their personal-fit accumulation.

Open question: how RDCO handles the case where a maintenance update requires breaking changes to operator-customized skills. Standard SaaS migration playbook (deprecation windows, migration scripts) probably applies.

Layer-by-layer “what’s next to unhobble”

Concrete L5-direction items, mapped per layer. These are the next inflection points if Ray-as-Starter-Kit ships.

| Layer | Next unhobble |
| --- | --- |
| L0 Substrate | Containerize Mac mini install; portable across hardware. Cost: 1-2 weeks one-time. |
| L1 Identity | Memory MCP that persists across compaction without manual write-bridge dance. Anthropic-side feature, may already be in flight. |
| L2 Communication | Generative-UI multi-step flows (not just one decision per page; whole workflows with state). Two more weeks of HTML+JS. |
| L3 Knowledge | Graph queries surfaced in /morning-prep (“recent additions to the harness-engineering cluster”), not just keyword search. /graph-query skill exists; needs UI surfaces. |
| L4 Observation | Computer-use for native macOS apps (Notes, Messages, Calendar) without screenshot+click-pixel overhead. MCP-native flows where they exist. |
| L5 Action | The 60+ domain-specific skills (xcode-*, swift-*, stripe-*, blender-*) are the second curve of capability expansion. Each one is a vertical bet. |
| L6 Discipline | Adaptive cadence: cron rates that auto-adjust based on activity (skip /process-newsletter watch if Gmail returned zero whitelisted in last 24h). |
| L7 Ratchet | /improve picking up structural changes (not just prompt edits) — e.g. “all 3 audit failures this week share root cause X, propose new invariant”. This is the harness modifying its own modification logic. |
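The L6 adaptive-cadence item can be sketched as a skip decision: stretch a poll's effective interval while its source stays quiet. The doubling rule and thresholds are illustrative:

```python
"""Sketch of adaptive cadence for polling loops. The doubling policy
is an assumption, not the shipped behavior."""

def should_run(hits_last_24h: int, hours_since_last_run: float,
               base_interval_h: float = 6.0) -> bool:
    """Double the effective interval when the last day produced zero hits."""
    interval = base_interval_h if hits_last_24h > 0 else base_interval_h * 2
    return hours_since_last_run >= interval
```

Applied to /process-newsletter watch, a quiet Gmail whitelist would halve the polling spend with no loss, since the backlog is drained on the next run either way.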

Notable quotes (from this session) (≤15 words each, in quotation marks)

Open follow-ups