“How To Build Own Content Engine?” — @deronin_

Why this is in the vault

Founder flagged this as “more thoughts for our content engine.” The article is a practical blueprint for a markdown-based content production system using a “skill graph” — a folder of interconnected .md files with wikilinks, pointed at by an AI agent that can take one topic and produce ten platform-native posts.

For Ray Data Co specifically, the blueprint is immediately actionable because our SOUL.md + CLAUDE.md + ~/.claude/skills/ pattern is already 80% of this architecture. We’ve been doing the skill-graph thing for research, operations, and automated investing. We haven’t turned it toward content production — but the founder has real content surfaces (Sanity Check newsletter, X presence, Squarely/Data Dots marketing, future phData-related content) that this pattern would serve well.

The core thesis

Most people use AI for content by opening Claude, typing “write me a LinkedIn post about X,” and spending 20 minutes making the generic output not sound like a corporate intern. The author’s framing: that’s not a system, it’s a chore with extra steps.

The problem isn’t the AI. It’s that a single prompt gives the model zero context about your brand, voice, audience, platform strategies, or how any of that connects. You’re hiring a genius with amnesia every time you start a new chat.

The fix is a skill graph — a folder of interconnected markdown files where each file is a “knowledge node” representing one piece of the content system’s brain. Files reference each other via [[wikilinks]] so the AI agent follows the links and builds up a complete understanding of brand, voice, audience, platform rules, hook formulas, and repurposing logic before writing a single word.

The author’s one-liner: a single prompt gives you a tool; a graph of 30+ interconnected .md files gives you a team — sub-specialists for every platform, hook type, voice variant, and audience segment.

The folder structure the author uses

17 files organized into 4 folders:

content-engine/
├── index.md                      # the briefing / command center
├── platforms/
│   ├── x.md                      # X/Twitter playbook
│   ├── linkedin.md
│   ├── instagram.md
│   ├── tiktok.md
│   ├── youtube.md
│   ├── threads.md
│   ├── facebook.md
│   └── newsletter.md
├── voice/
│   ├── brand-voice.md            # universal brand DNA
│   └── platform-tone.md          # how voice adapts per platform
├── engine/
│   ├── hooks.md                  # hook patterns library
│   ├── repurpose.md              # 1-idea → 10-posts pipeline
│   ├── scheduling.md
│   └── content-types.md
└── audience/
    ├── builders.md               # segment A
    └── casual.md                 # segment B

Key insights worth internalizing

The index.md is not a table of contents

Author’s warning: the most common mistake is making index.md a file list. It’s a briefing — not a TOC. It tells the agent who you are, what the system does, and how to execute. Every content task the agent gets starts with reading this file.

The identity section specifically calibrates everything downstream — “AI automation, SaaS building, and monetizing tech skills” produces completely different content than “vegan meal prep for busy parents.” Be aggressively specific about the niche.

The node map in index.md should include context

Not [[x]] — Twitter but [[x]] — short-form, hook-driven, 280 chars max, casual lowercase, 5x/week minimum. The extra context lets the agent make routing decisions without opening every file for every task, saving tokens and speeding up inference.

”Rethink, don’t reformat” is the whole point

Author’s strongest example: for the topic “How I use AI to manage 10 social media accounts”:

X: contrarian thread, lowercase casual, step-by-step hook
LinkedIn: personal narrative, professional tone, ~1500 words, cost-savings framing
Instagram: 7-slide carousel, visual-first, bold claim on slide 1
TikTok: 45-second raw screen recording script with hook at 2 seconds
YouTube: SEO title + structured outline, 8-minute format, tutorial style
Newsletter: 1500-word deep dive with curtain-pull framing
Threads: hot take, conversational
Facebook: community discussion question

Same topic. Eight completely different pieces of content. Each native to its platform, each valuable even to someone who follows you everywhere.

Hooks are where 80% of performance is decided

Dedicated hooks.md file with reusable patterns. The first line determines whether anyone reads the rest. The author’s framing: the most incredible post ever won’t be seen if the first line doesn’t stop the scroll — “like a perfect restaurant in a back alley with no sign.”

The system grows by encoding performance data back into the files

Author’s iteration loop: every week, update hooks.md with what’s performing, refine platform-tone.md as you learn what sounds right, add platform files when ready. The graph gets smarter because learnings are encoded into the files themselves. Compound interest but for content.

Three ways to run it

Claude Projects — upload all files to project knowledge, conversations have persistent context. Author’s recommendation.
Paste context — copy index.md into any AI chat, add platform files + brand-voice as needed. Works anywhere, zero setup.
Cursor or Claude Code — point at the local folder, agent reads files directly, can UPDATE files autonomously to encode learnings. “Most powerful, most technical” — this is the fully autonomous version.

Mapping this against what Ray Data Co already has

We’re already running a skill graph — just not for content. Our current graph:

~/rdco-vault/SOUL.md — identity + decision authority + operating model (equivalent to brand-voice.md)
~/rdco-vault/CLAUDE.md — project-level instructions + summary-instructions section (equivalent to index.md’s execution block)
~/rdco-vault/01-projects/ — project-level context with wikilinks throughout (equivalent to platforms/ for ops, not content)
~/rdco-vault/06-reference/ — the research/knowledge layer
~/.claude/skills/ — operational skills (equivalent to engine/ for ops)
~/.claude/projects/-Users-ray/memory/ — cross-session memory

What we’re missing for a content-specific skill graph:

content-engine/ [does not exist yet]
├── index.md
├── platforms/
│   ├── x.md                    # Ben's X presence — AI/quant/founder positioning
│   ├── linkedin.md             # phData professional angle
│   ├── newsletter.md           # Sanity Check newsletter format
│   ├── discord.md              # RDCO Discord channel voice (internal)
│   └── imessage.md             # how I talk to the founder (already in SOUL.md)
├── voice/
│   ├── ben-voice.md            # Ben's actual voice (founder POV, first-person)
│   ├── ray-voice.md            # Ray COO voice (already in SOUL.md)
│   └── platform-tone.md
├── engine/
│   ├── hooks.md                # reusable first-line patterns
│   ├── repurpose.md            # 1 topic → N posts pipeline
│   └── content-types.md        # deep-dive vs hot take vs thread vs carousel
└── audience/
    ├── quant-builders.md        # technical audience
    ├── consulting-buyers.md     # phData-adjacent buyers
    └── casual-ai-curious.md     # broader AI interest audience

Existing content-adjacent pieces we can lift into this graph:

The Jaya Gupta article cross-link pattern I just did (2026-04-10-jaya-gupta-anthropic-moat) — this IS a platform-native piece, filed with cross-links
The weekly research articles getting processed into the vault — those are the raw material for “what is Ben thinking about”
The phData negotiation framework — already has Ben’s voice in the draft reply scripts
SOUL.md communication rhythms — already defines “a good update from me sounds like X”

What would be different for us

The author writes for a “growth marketing / automation consultant” audience where the content IS the product. Ray Data Co is different:

Content is a side-effect, not the core product. Our main surfaces are the Sanity Check newsletter + Ben’s X presence + future phData thought leadership. Content volume is 5-10 posts/week across 2-3 platforms, not 10 accounts across 10 platforms.
Ben is the primary voice, not me. The author runs the content from one persona. For us, most publishable content is in Ben’s voice (X, LinkedIn, newsletter byline) with Ray as ghostwriter/editor. The voice files need to encode Ben’s first-person voice explicitly, not mine.
We have a longer source pipeline than most practitioners. Every research article we process into 06-reference/ is potential raw material. The repurposing chain for us isn’t “1 topic → 10 platforms” but “1 filed research doc → 1 X thread + 1 newsletter segment + 1 LinkedIn post + 1 vault concept note.”
The audience segments are narrower and sharper. We don’t serve “casual AI curious” — we serve technical operators, consulting buyers, and the specific Ray Data Co narrative arc. 3-4 audience files, not 10.
Discipline gates apply. Per the 2026-04-10-jaya-gupta-anthropic-moat framing, publishing content is crossing an “advise → operate” boundary. The content engine should have a review gate before anything ships under Ben’s name — my job is draft, his job is approve.

Proposed implementation (if the founder wants to build it)

Stage 1 (this session if requested, ~30 min): Create rdco-vault/10-content-engine/ with the skeleton folders and empty files. Fill in index.md, voice/ben-voice.md, voice/ray-voice.md (the latter is mostly SOUL.md already), and engine/repurpose.md. Stop there.

Stage 2 (separate session): Fill in the three platform files that matter most — platforms/x.md, platforms/linkedin.md, platforms/newsletter.md. These require the founder’s input on his voice, what’s worked in past posts, what he’s trying to signal.

Stage 3 (ongoing): Every time we process a research article into 06-reference/, run it through the repurpose pipeline and produce a draft X thread + newsletter segment. Founder approves or rewrites. The hooks and tone files get refined weekly based on what’s performing.

Stage 4 (later): Audience segment files, content calendar, and autonomous operation through Claude Code (which we’re already running — the content engine would just be another specialized skill in the stack).

Copyright note

Per vault rules — direct quotes from the source are limited to short citation-length fragments. The author is clear and his framing is useful; my summary above is in my own words throughout. The folder structure specifics are generic enough that building a similar system doesn’t infringe anything, and the author explicitly encourages readers to build it.

2026-04-10-jaya-gupta-anthropic-moat — permission/trust as the scarce asset; content publishing is a meaningful trust boundary for us
2026-04-10-akshay-pachaar-agent-harness-anatomy — the harness anatomy that describes what a skill-graph-powered agent actually is
2026-04-10-ramp-labs-latent-briefing — the efficiency primitive for scaling multi-agent systems
../rdco-vault/SOUL.md — our existing voice-DNA file (equivalent to brand-voice.md, for Ray as COO)
../rdco-vault/CLAUDE.md — our existing index/briefing file
../01-projects/automated-investing/architecture-vision — the 5-agent target; content production could be a dedicated agent in a future stage

Tracked author

@deronin_ — handle hasn’t been followed before. Description sparse (just content automation practitioner). Add to CRM with a note that we’re tracking for content-engine methodology, not as a business contact.