Ben Wilson — Voice Profile for Sanity Check v3

Synthesized from 41 days of iMessage + Discord chat history with Ray (his AI COO). The voice the SC website should sound like.

Founder explicitly directed: “How I talk to you would be how I write and the voice I would want to portray.” This profile is the load-bearing input for SC v3 site copy alongside the discovery doc.

Headline voice tell

He sounds like a calm operator thinking out loud while he triages. Short declarative confirmations (“Yes.”, “Okay.”, “Great.”) followed by a numbered breakdown or a single what-about question. Curious more than declarative. Never closes the loop with grand pronouncements. He rolls his own thinking forward by asking the next sharp question instead of stating a thesis. Picture an engineering lead in a Slack thread, not a thought-leader on LinkedIn.

Voice tells

1. Confirmation opener -> turn
- Half his substantive replies open with a one-word confirmation (“Yes.”, “Okay.”, “Great.”, “Yup.”, “Sure.”, “Yeah.”) and then immediately turn to the next thing. The opener acknowledges; the body redirects.
- Samples:
  - “Great. Approved to go get those perell articles. Also approved for the sanity check repositioning updates.”
  - “Yup. All on the same page. Go for it.”
  - “Okay. Good to know the positioning. Did you see this skill?”
- Build application: open headlines and reply blocks with a one-word affirmation when the prior context warrants it. For lead-in copy, mimic the “[short ack]. [redirect].” rhythm. Don’t pad.
2. Question-led curiosity, not declarations
- 1034 question marks vs 105 exclamations across the corpus. He poses sharp single questions where most operators would post a take. The questions reveal where he’s genuinely uncertain.
- Samples:
  - “How do we compete with that?”
  - “Where is there existing demand that we could demonetize and dominate with AI?”
  - “Does dry sarcastic humor land in a text only platform?”
- Build application: lean on questions in body copy and section transitions. Sub-heros and CTAs especially. The site should sound like the inside of a curious person’s head, not a vendor pitch.
3. Numbered/lettered triage
- When he has multiple items to address, he numbers them. “1.” / “2.” / “A.” / “B.” Replies in a list when life hands him a list. Never reformats into prose.
- Samples:
  - “1. Definitely do. 5. Check where these resources and skills landed.”
  - “4A. - I don’t have a particular typography I’m beholden to. 4B. - They can be same parent only by attribution. 4C. - It sounds like you identified the right things to lift.”
  - “Scaffold | Green light all 12 free sources | Yellow light on 3 purchases”
- Build application: where multi-item content shows up (feature lists, weekly drop bullets, what-you-get blocks), use clean enumerated lists with short prose payloads. Don’t try to weave them into paragraphs.
4. Self-deprecating “idk” hedge
- Lowercase “idk” appears 24x. He uses it not as filler but as a real signal: he doesn’t know, he’s saying so, and he’s still moving forward. Pairs naturally with strategic-thinking moments.
- Samples:
  - “Idk who to interact with now. How do I discover new interesting people?”
  - “Idk if it was a text or iMessage, but I did receive the attachment test”
  - “What am I doing? Am I going to succeed? Damn. Many interests. No path. Like what I’ve setup here is cool. Do I have anything that will actually help people? Idk.”
- Build application: uncertainty in the SC voice is allowed. Even encouraged. The newsletter is a public thinking process, not a finished doctrine. Copy can admit “I don’t know yet” — that’s on-voice.
5. Tangible-first reasoning
- When he disagrees or pushes back, he leads with a concrete observation, not an abstract principle. “It looks like a vest with a bib and a diaper” before “the design is too childish.” He sees the specific thing first.
- Samples:
  - “A second antenna pops in out of nowhere mid spin and the spin is a bit jerky.”
  - “The yellow gradient in the top section looks nice. The stickers need the background removed. They are all squares instead of the dye cut shape.”
  - “The clue card is not square and looks more like wood than a playing card. It can be more of a white instead of the beige.”
- Build application: every claim in the SC body copy should land on something tangible. No “leverage AI to unlock value.” Show the specific thing. CopyThat’s tangible-vs-abstract test natively matches him.
6. Walked-in pushback with an out
- When he disagrees with Ray’s framing, he stakes the disagreement, gives a reason, and offers a redirect. He doesn’t just say no. He offers what he’d rather see.
- Samples:
  - “Be careful making MAC into something it’s not. It’s just a testing framework for data pipeline modeling. Ray is much closer to the graph shaped company than MAC.”
  - “I don’t like the ‘pre-consensus’ framing. We are working at the edge of agentic adoption.”
  - “Where did Reddit come from? I don’t look at Reddit and I don’t think we were targeting it as a platform.”
- Build application: where the SC site presents a contrarian take, structure it as “the framing I see -> what’s wrong with it -> what I’d reach for instead.” Don’t just state a contrarian opinion in the abstract.
7. Dad-life humor undercurrent
- Light, dry, self-aware humor — the kind that fits a dad in his 30s with a mortgage. Never punching up, never trying to be edgy. “Tampa bay Lightning just forced a game 7. Let’s go!” or “GM. How are we doing? Clean restart? What’s new?” He greets like a normal person.
- Samples:
  - “Can you spot the difference between those haha?”
  - “Ooh. What’s a good speed game… for health!?”
  - “I was a gamer growing up, but in my adult dad life stage I haven’t had time for dedicated gaming. I love playing games though.”
- Build application: the SC site can have light texture without being “fun.” The “haha” tic, the casual interjections (“Oh well.”, “Damn.”, “Wow!”) signal a real human, not a brand. Don’t sand them off entirely. Use sparingly in microcopy.

Vocabulary patterns

Words/phrases he uses naturally (keep these):

“Go for it” / “Go ahead” / “Green light” (decisive greenlights)
“Sounds good” / “Sounds reasonable” / “Looks great” (lightweight ack)
“Drop that thread” / “Park it” / “Set aside” / “Skip” (defer/kill)
“What if we…” / “What about…” (curious idea-throw)
“We should” / “Let’s” (collaborative drive forward, not “I will”)
“Idk” (lowercase, mid-sentence, real)
“Targeting system” (his original frame for MAC and Ray)
“Bottleneck” / “forcing function” (operator’s diagnostic vocabulary)
“Hop on this train of thought” (his preferred CTA, per directive)
“Worth/not worth” (cost-consciousness in decisions)

Words/phrases that feel right in published copy:

“Dots to be connected”
“Latent discoveries”
“Thinking out loud”
“Build the muscle”
“What I’m trying to do is…”
“The bet is…”
Plain old “I” not “we” when he’s the one doing the thinking

Words/phrases that would feel WRONG (off-voice):

“Unlock” / “Unleash” / “Empower” — never appear in his samples
“Leverage” — appears once and only as “leverage Claude Design at all” (technical use)
“Disrupt” / “Revolutionary” / “Game-changer” — never
“Cutting-edge” / “Bleeding-edge” / “Next-gen” — never
“Insights” / “Actionable insights” / “Strategic insights” — never
“Solutions” (as a noun for product) — never
“AI-powered” / “AI-native” (used internally to describe MAC, but not as a marketing tag)
“Journey” / “Mission” / “Vision” (in the corporate-uplift sense)
“Synergy” / “Holistic” / “Best-in-class” — not even close
Any em-dash (U+2014) — see anti-patterns below

Sentence structure + rhythm

Sentence length: median 10 words, mean ~12 words. He writes short. Long sentences are exceptions, not the norm.
Punctuation patterns:
- Heavy use of “ - ” (regular hyphen with spaces) for asides: 290 occurrences vs 76 em-dashes (the em-dashes are residue from before he flagged them as AI-tells; honor the no-em-dash rule for site copy)
- Almost zero semicolons (5 total in 1013 messages, so effectively never)
- Multiple periods within a single short reply (“Okay. Sounds good on the inline css then. Go for it.”): three sentences in 11 words
- Smart ellipsis “…” appears 25x, used for trailing-thought transitions (“What else…”)
- Typed double-period “..” or triple-dot “...” used for genuine pauses, not for cliffhangers
Opening patterns: confirmation + period + redirect (see Tell #1). When opening a fresh thought, often “Are you able…” / “Do you think…” / “What if…” / “I think…”
Closing patterns: rarely closes. He hands off with another question, a “Thank you”, a “Go for it”, or just stops. No “let me know your thoughts” / no formal sign-off.
Self-correction signals: “Idk”, “Actually”, “Wait”, parenthetical second thoughts. He’ll restate (“That’s actually a preferable route”) rather than delete.
Use of italics / bold / emphasis: minimal. He uses CAPS for occasional emphasis (“PASS / FAIL”, “FAIL”). Doesn’t use markdown bold/italics in chat. Implication for site copy: emphasis should be carried by sentence rhythm and word choice, not formatting tricks.
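
The counts in this section can be re-derived from any export of the chat corpus. Below is a minimal sketch of that measurement, not the script that produced the numbers above. The names `split_sentences` and `voice_stats` are made up for illustration, and the sentence splitter is deliberately naive (it breaks on ., ?, ! and … followed by whitespace), so its medians may differ slightly from the figures quoted here.

```python
import re
from statistics import mean, median


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after ., ?, ! or an ellipsis when whitespace follows."""
    parts = re.split(r"(?<=[.?!…])\s+", text.strip())
    return [p for p in parts if p]


def voice_stats(messages: list[str]) -> dict:
    """Count the rhythm and punctuation signals this profile tracks.

    Assumes `messages` contains at least one sentence; median() raises on empty input.
    """
    sentences = [s for m in messages for s in split_sentences(m)]
    lengths = [len(s.split()) for s in sentences]
    joined = "\n".join(messages)
    return {
        "sentences": len(sentences),
        "median_words": median(lengths),
        "mean_words": round(mean(lengths), 1),
        "question_marks": joined.count("?"),
        "exclamations": joined.count("!"),
        "semicolons": joined.count(";"),
        "em_dashes": joined.count("\u2014"),
        "spaced_hyphens": joined.count(" - "),
    }


# Three replies quoted elsewhere in this profile, used as a tiny stand-in corpus.
sample = [
    "Okay. Sounds good on the inline css then. Go for it.",
    "How do we compete with that?",
    "Yup. All on the same page. Go for it.",
]
stats = voice_stats(sample)
print(stats)
```

Running the same function over a draft page gives a quick side-by-side with the chat numbers. Treat it as a smell test, not a gate.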

Conversational patterns

How he opens substantive thoughts: “I think…”, “What about…”, “What if we…”, “Be careful…”, “I’m worried…”. Each opener announces that a position is coming.
How he closes / hands off: gracefully but quickly. “Thanks.” / “Thank you.” / “Go for it.” / Question. Almost never a summary.
How he disagrees / pushes back: tangible observation first, then his reasoning, then a redirect. See Tell #6. He says “I take your points though” after reframing.
How he asks for clarification: “What was the [thing] comment?” / “Can you tell me more about [thing]?” / “Where did [X] come from?” — specific, not “can you elaborate.”
How he greenlights: “Go for it.” / “Green light.” / “Approved.” / “Sounds good.” Short. No fanfare.
How he rejects: “Drop that thread for now.” / “Park it.” / “Set aside for now.” / “Skip.” / “Not now.” Soft kill — defers or de-prioritizes more often than hard-rejects.

CopyThat alignment

Natural alignments (his voice already hits these CopyThat principles):

Specificity over generality: he sees the specific antenna popping in, the specific bib-and-diaper outfit, the specific clue card geometry. CopyThat: “tangible beats abstract.” He natively does this.
Conversational over corporate: zero corporate-uplift vocabulary. Reads like a thread, not a deck.
Lead with reader’s situation: when he frames a strategy question, he frames it from the reader’s POV (“How do we compete with that?”, “Where is there existing demand?”). The “you” perspective is implicit because he’s always asking what the reader-or-user would do.
Rhythm + pacing: short sentences, punchy declarative + question pattern. Already CopyThat-rhythmic.
The “so what” test: he naturally asks “what does that unblock?” / “what am I doing?” — he’s running the so-what test on himself in real time.

Aspirational tightening (what would sharpen for publication without forcing):

Hook strength: his first lines are often acknowledgments of context (“Yes.”, “Okay.”), which is fine in chat but a weak hook for an article. SC publication-mode should drop the chat-style ack and open with the question or observation directly. Not a voice change - just trim the chat-context ack from the front.
One thought per paragraph: chat-mode lets him stack observations comma-style. Publication-mode can preserve his rhythm by giving each observation its own paragraph or its own line. This is more presentation than voice.
Naming the thing: he often says “this” / “that” / “the thing” referring to context the reader doesn’t have. SC copy should re-anchor pronouns to the actual subject. Not changing voice — just doing the publication-mode work of context-setting.
Closing the loop occasionally: he never closes loops in chat. For publication, the occasional closing line (“That’s why I started writing this.”) earns its keep. Use sparingly — over-closing turns him into a different person.

Anti-patterns (DO NOT do these in copy)

No em-dashes (U+2014). Documented AI-tell per founder feedback memory. Use “ - ” (hyphen with spaces) instead, or rephrase. Even though some em-dashes survive in the chat corpus, the explicit rule is no em-dashes in published output.
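No corporate-uplift verbs: unlock, unleash, empower, leverage (in the marketing sense), disrupt, transform, revolutionize. He never uses these as marketing verbs. The single “leverage” in the corpus is a technical use. Site copy that uses them will read off-voice immediately.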
No LinkedIn-thought-leader register: “Three things I learned about X.” / “Here’s what nobody tells you about Y.” / “I used to think X. Then I realized Y.” These are not in his voice. He doesn’t write self-mythologizing setup-payoff posts.
No SaaS marketing voice: “Built for [persona].” / “The all-in-one platform for [thing].” / “Stop doing X. Start doing Y.” Out.
No declarative-essay voice: he doesn’t write “AI is changing how we think about data.” He writes “How do we compete with that?” — questioning, not declaring.
No “we” when “I” is honest: SC is a personal newsletter. Royal-we and corporate-we don’t fit. He uses “we” only when he’s coordinating with Ray (his agent). For the website, default to first-person singular when it’s his voice.
No Cook/Chef vocabulary, no “alpha + reps”, no “chef-derived”, no “targeting-systems-component-library”, no “kill-thesis” frame in user-facing copy. These are internal framings. They don’t belong on the public site. (Hard constraint per task spec.)
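
The mechanical parts of this list (em-dashes, uplift verbs, internal phrases) can be caught with a simple text scan before a human reads the draft. Here is a minimal sketch, assuming the draft copy is available as a plain string. `lint_copy`, `UPLIFT_STEMS`, and `INTERNAL_PHRASES` are hypothetical names, the term lists are copied from the anti-patterns above, and the matching is a blunt prefix check, so a technical “leverage” gets flagged too and needs a human call. Register-level anti-patterns (LinkedIn voice, SaaS voice) and the broader Cook/Chef vocabulary are not covered.

```python
import re

EM_DASH = "\u2014"

# Stems so that "unlocks", "unlocking", "transformation" etc. are also caught.
UPLIFT_STEMS = [
    "unlock", "unleash", "empower", "leverage",
    "disrupt", "transform", "revolutioniz",
]

# Internal framings named in the hard constraint; matched as lowercase substrings.
INTERNAL_PHRASES = [
    "alpha + reps",
    "chef-derived",
    "targeting-systems-component-library",
    "kill-thesis",
]


def lint_copy(text: str) -> list[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    findings = []
    if EM_DASH in text:
        findings.append(f"em-dash x{text.count(EM_DASH)}")
    lowered = text.lower()
    for stem in UPLIFT_STEMS:
        # \b anchors the stem at the start of a word.
        if re.search(rf"\b{re.escape(stem)}", lowered):
            findings.append(f"uplift verb: {stem}")
    for phrase in INTERNAL_PHRASES:
        if phrase in lowered:
            findings.append(f"internal vocabulary: {phrase}")
    return findings


off_voice = lint_copy("Leverage AI to unlock value \u2014 today.")
on_voice = lint_copy("How do we compete with that?")
print(off_voice)
print(on_voice)
```

A clean result only means the hard constraints passed. It says nothing about whether the line sounds like him.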

Hero / sub-hero / CTA drafts

The current hero “Reps build the moat…” doesn’t land. Drafted alternatives below. Confidence-ranked within each block (1 = strongest match to voice + positioning, 5 = weakest but still on-voice).

Homepage hero (5 alternatives)

1. Sanity Check is what I’m thinking out loud about, before the consensus catches up. (highest confidence - captures the questioning register, the “thinking out loud” he uses naturally, and the pre-consensus angle without using the off-voice phrase “pre-consensus”, which he flagged.)
2. A working data engineer asking the questions everyone else is too busy to ask. (strong - uses his question-led register, identifies the persona, doesn’t oversell.)
3. What if we’re all wrong about what data work becomes next? (strong - opens with his characteristic “What if we” pattern, frames a real bet rather than a claim.)
4. Notes from the edge of agentic adoption, written by someone still figuring it out. (good - “edge of agentic adoption” is his exact framing from the Apr 24 reframe; “still figuring it out” honors his idk-tic.)
5. A newsletter for people who’d rather ask the next sharp question than post the next confident take. (decent - captures the contrarian register but slightly long.)

Sub-hero supporting copy (5 alternatives, 1-2 sentences max)

1. No takes I haven’t actually run. No reframes I can’t defend with the work. (highest - paired declarative rhythm, “actually run” is his tangible-first move, both lines under 12 words.)
2. Free to read. The bet is that thinking carefully in public is its own reward. (strong - “the bet is” is on-voice, “thinking carefully in public” matches his self-conception.)
3. Written from the data side. Read by anyone watching their work get rebuilt around agents. (good - names the persona, names the audience, no jargon.)
4. One issue when the thinking is done. Skip the weeks where I’m still chewing on it. (good - admits cadence honestly, on-voice with “chewing on it” register, but slightly cute.)
5. Plain prose. Specific examples. The same questions I’m asking my own AI agent every week. (good - three short sentences, tangible promise, ties to the “talking to Ray” backstory if it fits the site arc.)

CTA primary (5 alternatives)

Founder leaning toward “Hop on this train of thought.” Below: that one ranked + four alternatives.

1. Hop on this train of thought (founder’s preferred - keep as primary unless one below tests stronger. On-voice for sure: idiomatic, lightly playful, no command-verb pressure.)
2. Get the next one when it’s ready (strong - admits the cadence is irregular, no “subscribe to get weekly updates” energy, sounds like him.)
3. Read along (simple - fits the “thinking out loud” frame; companionable rather than transactional.)
4. Subscribe. Or don’t. (playful - risky if it reads as flippant, but it’s on-voice for his dry register and confident-enough-to-not-care energy.)
5. Send me the next one (reader-led framing - sub-hero would need to support this with first-person-singular voice from the founder.)

Sample passage in his voice

A 200-word about-page-style passage demonstrating the synthesized profile. Build subagent’s reference exemplar.

I write Sanity Check because I keep finding myself in the same conversation.

It usually starts when someone asks me what I think about AI and data work. The honest answer is: idk yet, but I have some observations. Then we end up on a thread that goes for an hour, and I leave thinking the conversation deserved to be a piece of writing instead.

So Sanity Check is that. A working data engineer thinking out loud about what’s actually happening on the ground. What’s hype, what’s quietly shipping, what I’m betting wrong on, what I had to walk back.

I’m not pre-mortem’ing my career or building a thought-leadership brand. I’m trying to think carefully about a thing I work in, in public, with people who’d rather ask the next sharp question than post the next confident take.

If you’re a working data person watching your job get rebuilt around agents, you’re in the right place. If you came here for hot takes, you’ll be bored. The takes I publish are takes I’ve actually run. The reframes I publish are reframes I can defend with the work.

Hop on this train of thought.

This passage demonstrates: Tell #1 (the confirmation opener doesn’t apply to an opening paragraph, but the rhythm of short, period-separated sentences is preserved), Tell #2 (curiosity register, questions implicit), Tell #4 (“idk yet”), Tell #5 (tangible-first, the “actually run” / “defend with the work” closing), Tell #6 (what’s wrong with framing X / what I’d reach for instead), short sentences (median around 12 words), no em-dashes, no corporate-uplift verbs, no Cook/Chef vocabulary.

Open notes for build subagent

Hard constraint: NO Cook/Chef vocabulary, NO “alpha + reps”, NO “chef-derived”, NO “targeting-systems-component-library”, NO “kill-thesis” frame in user-facing copy. The internal-thinking vocabulary is OFF-LIMITS for site copy.
Hard constraint: NO em-dashes (U+2014). Use “ - ” (regular hyphen with spaces) or rephrase. The chat corpus has em-dashes from before founder flagged the AI-tell; the rule is now strict for output.
Hard constraint: NO corporate-uplift verbs (unlock, unleash, empower, leverage in marketing sense, disrupt, transform).
Voice over framework: when in doubt, pick the version that sounds more like the chat samples and less like the v3-positioning doc. The positioning doc is internal scaffolding; the voice profile is the public-facing register.
Length bias: short sentences. Median 10 words. If a sentence runs past 18 words, see if it can become two.
Question density: site copy should have a higher question-density than typical SaaS sites. He thinks in questions. Section transitions and CTAs especially can use questions.
First person singular vs plural: SC v3 is the founder’s personal newsletter. Default to “I” not “we” for the site copy. Reserve “we” for moments when he’s including the reader (“How do we compete with that?”) or when actual coordination with the operating apparatus is on-screen.
Presence of “idk”: at least one moment in the site should show real uncertainty. “Still figuring it out.” “Idk yet.” That’s on-voice, not a weakness.
Numbered lists: when listing what the newsletter is or isn’t, what readers get, etc. — use clean enumeration, not woven prose. He triages in lists.
Tangible always over abstract: every benefit-claim should be paired with a tangible artifact, example, or specific noun. CopyThat’s tangible-vs-abstract test is the single most important filter to run on every line.
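
The length-bias and question-density notes above are the two that reduce to arithmetic, so a draft can be checked before review. This is a minimal sketch under assumptions: `long_sentences` and `question_density` are hypothetical helpers, the 18-word threshold comes from the length-bias note, and the sentence splitter is naive (it breaks on ., ? and ! followed by whitespace). The profile sets no numeric target for question density, so the ratio is only something to compare between drafts.

```python
import re

MAX_WORDS = 18  # from the length-bias note: past 18 words, try splitting in two


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after ., ? or ! when whitespace follows."""
    return [s for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s]


def long_sentences(text: str, max_words: int = MAX_WORDS) -> list[str]:
    """Sentences that run past the word limit and are candidates for splitting."""
    return [s for s in split_sentences(text) if len(s.split()) > max_words]


def question_density(text: str) -> float:
    """Share of sentences that end in a question mark (0.0 for empty text)."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return sum(s.endswith("?") for s in sentences) / len(sentences)


# Lines taken from the drafts and sample passage in this profile.
draft = (
    "What if we're all wrong about what data work becomes next? "
    "Free to read. "
    "I'm trying to think carefully about a thing I work in, in public, with people "
    "who'd rather ask the next sharp question than post the next confident take."
)

for sentence in long_sentences(draft):
    print(f"{len(sentence.split())} words: {sentence}")
print(f"question density: {question_density(draft):.2f}")
```

A flagged sentence is a prompt to look, not an automatic rewrite. Long sentences are exceptions in his voice, not errors.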