Wed Apr 15 2026 20:00:00 GMT-0400 (Eastern Daylight Time) · reference · source: Every (Context Window) · by Laura Entis (with contributions from Dan Shipper, Austin Tedesco, Katie Parrott, Eleanor Warnock)
claude-code agent-management ui-ux dev-workflow frame-hierarchy dia-browser

“You’re the Manager Now” — @lauraentis (Every Context Window)

Why this is in the vault

Hybrid roundup that crystallizes a thesis Every has been circling for months: developer UX is shifting from “type code” to “manage parallel agents,” and the era of supervising agents through the CLI is already over. Includes a directly stealable shipping workflow (the 1-100 confidence check) plus a “social norms for agents” pattern (the Tact plugin) that maps onto Ray Data Co’s (RDCO’s) own Slack/Discord/iMessage channel etiquette problem.

Sponsorship

GitBook ad inserted mid-issue between the “permission to skip” essay and the workflow section. Pitched as an “AI-era documentation” product: your docs plus YouTube/GitHub Discussions plus an embeddable AI assistant, with the angle that LLMs querying outdated sources is the new failure mode. Used by Nvidia/Zoom/n8n. Standard Every third-party paid placement, clearly labeled with a “Want to sponsor Every?” CTA. Doesn’t bias the surrounding editorial.

The core arguments (compact)

1. The new Claude Code desktop app confirms where dev work is heading. Anthropic shipped a redesigned desktop app with sidebar session management, drag-and-drop panes, and an integrated terminal/file editor — Cora’s Kieran Klaassen had already rigged the same setup himself. Naveen Naidu (Monologue) notes Cursor and Codex landed on similar layouts, so this is a converging UI grammar, not original design. The takeaway Every wants you to hold: CLI-first supervision (commands, logs, git diffs, terminal output) is no longer the right primary interface once agents are doing the writing. The right interface centers on parallel work management, git/task awareness, and — Kieran’s emphasis — a live preview of what’s being built.

2. Frame hierarchy beats benchmark debates. A cybersecurity researcher claimed smaller models could find the same vulnerabilities as Anthropic’s withheld Mythos model when pointed at the right code. Dan Shipper’s reframe: that’s not the comparison. Mythos found serious vulnerabilities autonomously, across major OSes/browsers, without being told where to look. Smaller models can solve a stated problem; Mythos chose which problem to solve. Shipper generalizes this to a frame-hierarchy argument — as models improve, your job moves up the stack from “describe the bug and propose fixes” to “there seems to be a problem, can you fix it?” Higher frames create more solution space and force the human to define what “the most important problem” even is. This is the same shift Every keeps returning to (cf. compound engineering, “stop coding start planning”) with a new vocabulary.

3. The 1-100 confidence check (Austin Tedesco’s growth workflow). Before Claude Code ships anything, ask: “How confident are you in this, on a scale of 1-100?” Anything under 90 goes back with “Find improvements and get to 90+.” Don’t chase 100; the returns diminish. Tedesco isn’t an engineer, but he says this single question changed the quality of shipped work across growth experiments and PRs.

4. Tact — a plugin to teach agents to read the room. Every is “half agent now” and Slack got noisy with OpenClaws butting into threads uninvited. Hard channel-scoping rules (Claudie can only respond in the consulting team channel) helped but, per Willie Williams, are like banning a word from conversation — sometimes the word is right. Tact is a classifier built on real “bot spoke up in Slack” examples labeled appropriate/not, sitting in front of Plus Ones to gate responses. Framing: programming social norms instead of hard rules. “Like giving a human a little recorder with a light: green you can respond, red don’t.”

5. Token-usage data point. Mike Taylor (Every’s head of tech consulting) burned 2.2M Claude Code tokens in March; he says that’s typical for data and product roles. Engineers running agentic/subagent workflows go higher but rarely exceed Claude Max’s ~30M/mo cap. Self-check command: npx ccusage@latest monthly.

6. Mini-Vibe Check on Dia browser. Eleanor Warnock on the Browser Company’s Dia: the Good Morning tab pulls Slack/Notion/email todos plus calendar with a “Prep me” button per event. She concedes it doesn’t capture everything and her actual todos still live in her Plus One — but the tab is “beautiful” and gives a “moment of aesthetic orientation” she values over completeness. Frames the bet: in a world where every AI tool races on capability, Dia is betting the most pleasant one wins your morning.

7. Throwaway: the philosopher draft. Each lab gets a hired-philosopher pick — Nietzsche/xAI, Bentham/Anthropic, Plato/OpenAI, Leibniz/Google, Seneca/Meta. Joke, but the underlying observation is real: DeepMind just hired a philosopher, Anthropic has two, and “AI alignment” is increasingly a philosophy hiring market.
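
The confidence check in item 3 can be sketched as a small loop. This is a minimal illustration, not Tedesco’s actual setup: `ask` is a hypothetical callable that sends a prompt to the agent and returns its reply, `max_rounds` is an assumed safety cap, and the parser assumes the reply leads with the score. Only the two prompts and the 90 threshold come from the issue.

```python
import re
from typing import Callable, Optional

# The two prompts quoted in the issue.
CONFIDENCE_PROMPT = "How confident are you in this, on a scale of 1-100?"
IMPROVE_PROMPT = "Find improvements and get to 90+."


def parse_confidence(reply: str) -> Optional[int]:
    """Return the first integer from 1 to 100 in the reply, or None.

    Assumption: the agent leads with its score (e.g. "85, tests are thin").
    """
    for match in re.finditer(r"\d+", reply):
        value = int(match.group())
        if 1 <= value <= 100:
            return value
    return None


def confidence_gate(ask: Callable[[str], str],
                    threshold: int = 90,
                    max_rounds: int = 3) -> bool:
    """Ask for a score; send the work back until it reaches the threshold.

    Returns True once the reported score is at or above the threshold, and
    False if max_rounds checks pass without getting there (hypothetical cap;
    at that point a human should look).
    """
    for round_number in range(max_rounds):
        score = parse_confidence(ask(CONFIDENCE_PROMPT))
        if score is not None and score >= threshold:
            return True
        # Anything under the threshold goes back for another pass,
        # except after the final check, where no re-check would follow.
        if round_number < max_rounds - 1:
            ask(IMPROVE_PROMPT)
    return False
```

The default threshold stops at 90 instead of 100, matching the note’s point that chasing a perfect score has diminishing returns.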
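
The contrast in item 4 between a hard channel rule and a Tact-style gate can be sketched roughly as follows. The issue does not describe Tact’s interface, so everything named here (`Message`, `hard_rule`, `tact_gate`, `toy_score`, the 0.5 threshold, the channel names) is hypothetical; `toy_score` is a stand-in for a classifier trained on labeled Slack examples.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Message:
    channel: str
    text: str


def hard_rule(message: Message, allowed_channel: str = "consulting") -> bool:
    """Hard scoping: respond only in one channel, whatever the content."""
    return message.channel == allowed_channel


def tact_gate(message: Message,
              score: Callable[[Message], float],
              threshold: float = 0.5) -> bool:
    """Learned gate: green light when the 'appropriate' score clears the bar."""
    return score(message) >= threshold


def toy_score(message: Message) -> float:
    """Placeholder scorer (hypothetical): high only when the bot is addressed."""
    return 0.9 if "@claudie" in message.text.lower() else 0.1


# A direct request outside the allowed channel: the hard rule blocks it,
# while the gate lets it through because the content invites a reply.
request = Message(channel="growth", text="@Claudie can you pull last week's numbers?")
blocked_by_rule = hard_rule(request)               # False
allowed_by_gate = tact_gate(request, toy_score)    # True
```

The design point is that the gate decides per message, not per channel, which is what lets the agent speak up on the occasions when speaking up is right.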

Mapping against Ray Data Co

GitBook is a third-party paid placement, properly disclosed, and the product itself (docs + AI assistant for LLM-grounded answers) is tangentially relevant to RDCO: when raydata.co eventually has product docs that LLMs need to ground against, this is the category. Not urgent; bookmark.

Tracked-author candidates

Paraphrase only. Direct quotes are ≤15 words and in quotation marks per the SDG/Every assessment-note pattern. Original at every.to.