06-reference

smart ape md vs html three questions

Fri May 08 2026 20:00:00 GMT-0400 (Eastern Daylight Time) ·reference ·source: X long-form article by @the_smart_ape ·by The Smart Ape (@the_smart_ape) — pseudonym
html-vs-markdownagent-output-formathybrid-patternclaude-codeformat-decision-frameworkhyperframes-validation

“md or html?” — @the_smart_ape

Why this is in the vault

Founder shared 2026-05-09 ~10:24 ET as a side-quest while we were mid-build on the /vault/ KB-as-routes feature. Same week as Thariq’s “Unreasonable Effectiveness of HTML” piece (filed 2026-05-08). Where Thariq argued HTML wins for human-facing artifacts, this piece pushes back with a more nuanced framework: format choice is a function of audience + lifecycle + horizon, and the load-bearing insight is the hybrid pattern (one md canonical → many html derivatives), not the binary.

Validates RDCO’s emerging architecture (vault = md canonical, /decisions and /vault/ HQ routes = html derivatives) and gives us a 3-question test to apply to future artifact-format choices.

Author note

@the_smart_ape is a pseudonym account. Modest reach (41 likes, 21k impressions, 9 RTs at fetch). No verifiable institutional affiliation. The framework reads as legitimate working-developer experience, not vendor pitch — internal consistency is high. Treat as one well-articulated practitioner perspective, not authoritative spec.

The framework

Every doc has three properties. Each votes md or html. When all three align, format picks itself. When they split, use the hybrid pattern.

Question 1 — Who actually reads this?

If Claude re-ingests it in a future session, html eats the context window:

Rule: doc is reference Claude will read again → md, no exceptions. One-time human deliverable → html on the table.

Question 2 — How many times will this get edited?

Claude does not edit html the way it edits md:

Author calls this markup drift.

Rule: edited >2-3 times → md or hybrid. HTML is a publication format, not an iteration format.

Question 3 — How long does this thing live?

Two tests:

Grep test. In 6 months, can you grep -r "pricing" and find this doc? In md, yes. In html, you’re searching for <h2>pricing</h2> and missing <h2 class="section-title" data-anchor="pricing">pricing</h2>. Doc exists, can’t find it.

Survivability test. Open the html file in 3 years. Does the CDN link to Tailwind still work? Did Lucide change its API? Did Chart.js bump majors? A md file from 2015 still opens cleanly. An html file from 2015 with 3 CDN imports does not.

Rule: ephemeral / demonstrative / one-shot → html. Retrieved / indexed / outlives author → md.

The hybrid pattern (load-bearing)

The move best teams use: write the canonical doc in md. Generate html views on demand for specific audiences.

Example: one architecture.md becomes:

Each html is a derived artifact. Regenerate when source changes. Md stays reviewable, indexable, greppable. Html does presentation lift.

Setup is “brutally simple: a 10-line script that pipes your md into Claude with three different system prompts. No infrastructure, no lock-in, no commitment to a format.”

Author’s claim: “teams that get this never talk about it because it sounds too obvious. it isn’t. most teams duplicate content across formats and watch them drift apart in 3 weeks.”

The 30-second reversibility test

Before committing to html, ask: if I had to convert this back to clean markdown right now, could I do it in one prompt?

If no → content is mixed with markup. That’s the warning sign. Content should always be cleanly extractable from the format. The day you can’t separate them is the day the document becomes unmaintainable.

Author claims this catches 90% of bad format choices in 30 seconds.

Three legitimate exceptions to default-md

  1. Sales demo or external one-pager where the aesthetic IS the content
  2. True one-shot where you literally never touch the file again
  3. Interactive prototype where the interaction IS the deliverable

Mapping against Ray Data Co

Strong validation for RDCO’s emerging architecture. Founder articulated his read as “vault is md, ephemeral decisions are html” — directionally right, but misses the hybrid pattern as the third bucket.

What we got right (per the framework)

What we already do that matches the hybrid pattern

Where the article tightens our forward design

When we build /bets// dashboards (the A in C→A queued for next):

The Thariq tension (resolved)

Thariq’s “Unreasonable Effectiveness of HTML” (filed 2026-05-08) argued HTML wins for founder-facing artifacts because it gets read; >100-line markdown doesn’t. The Smart Ape says HTML breaks edit cycles + retrieval + grep + survival. Both are right, and the hybrid pattern resolves the tension:

Both writers are arguing one slice of the same truth.

Notable quotes (≤15 words each, in quotation marks)

Open follow-ups

Source caveat

Article body retrieved via xmcp getPostsById with tweet.fields: ["article", ...] + expansions: ["article.cover_media", "article.media_entities"] — same fetch path validated for Thariq’s piece on 2026-05-08 and Dan Farrelly’s piece same week. The article.plain_text field returns full body. Author is anonymous; treat individual claims (3x token cost, 15-25% retrieval degradation) as illustrative figures from his own work, not validated benchmarks.