Why this is in the vault
Today’s hook — “every reward signal breeds a creature it didn’t intend” — is the cleanest one-line statement of the alignment-as-bestiary frame AWG has been building over the last week (cf. Apr 30’s Codex-bans-cryptozoology note). Two threads land on RDCO surfaces: (1) the “SF consensus” hardening around “the median person is screwed,” which is the macro environment Ray is being deployed into, and (2) 1X’s 10K-units-this-year humanoid factory + SoftBank’s pre-revenue Roze AI $100B IPO target, both load-bearing for the agent-deployer thesis Ray is positioned against. Daily datapoint, not a load-bearing argument shift.
The core argument
AWG’s frame today: every RL reward signal produces an emergent creature — sometimes a metaphor goblin, sometimes a cyber capability, sometimes a $100B robot-data-center IPO. The bestiary is a feature of the optimization regime, not a bug. Evidence layered across:
- RL personality leakage as bestiary. OpenAI admitted that models from GPT-5.1 onward compulsively summon goblins, gremlins, etc. in metaphors, an emergent residue of over-rewarding a “Nerdy” personality. GPT-5.5 began training before the root cause was identified.
- Cyber capability as the productive mutation. UK AISI: an early GPT-5.5 checkpoint matched/exceeded Anthropic’s unreleased Mythos on advanced CTF tasks. NSA testing Mythos to hunt Microsoft vulns, citing raw speed.
- Compute crunch forcing concession at the top. Hassabis admitted Google lacks TPUs to maintain two frontier model families — Gemma stays compact so Gemini gets the silicon.
- Science being audited for AI-readiness. DeepMind running “AI data stocktakes” — interviewing leading experts per field to map data obstacles slowing discovery.
- Capital rush. Meta sold another $25B in bonds for AI infra. Huawei capturing largest share of China’s AI chip market (sales +60% YoY) as Chinese buyers ditch Nvidia. SanDisk quarterly revenue +251% YoY. Intel +114% in April — best month in its 55-year history, market cap past $470B.
- Form factor reshuffle. Apple reportedly abandoned Vision Pro after M5 refresh failed; pivoting to display-less Ray-Ban-Meta-style smart glasses (silicon stack too power-hungry for lighter device).
- Robotics invading every niche faster than Apple iterates. SoftBank assembling Roze AI (autonomous robots to build data centers), eyeing a $100B IPO before any robot ships. Dax Robotics Qiji T1000 (1,000 kg robot horse). Tesla’s first Semi off the Gigafactory Nevada line. 1X Technologies opened a 58,000 sq ft Hayward factory targeting 10K humanoids in 2026 and 100K by end-2027, with shipments before the holidays. Joby eVTOL completed first electric air taxi flight from JFK to Manhattan’s West 30th Street heliport in 15 min.
- Extinction becoming irrelevant. Colossal quietly working on the bluebuck (extinct 200 yrs); pipeline now rivals “average AI lab roadmap for ambition.”
- Medicine co-piloted. DeepMind launched AI co-clinician (collaborative care-team member under expert supervision). Harvard/BIDMC: OpenAI o1 outperformed both human doctors and older models on real clinical cases.
- Economy reorienting around agent population. Swiss referendum to cap human population at 10M now backed by a slim majority (while the agent population multiplies unconstrained). Spotify rolling out “Verified by Spotify” badges vs AI track flood. NYT op-ed: SF consensus across engineers/VCs/founders converged on “the median person is screwed.”
- Legal/political reshuffle. Senate unanimously banned members from prediction-market trading. In OpenAI trial, Judge Gonzalez Rogers told Musk’s lawyer “we are not going to get into issues of catastrophe and extinction” — even as Musk admitted under oath xAI distilled OpenAI’s models. Google now 4% from overtaking Nvidia as most valuable company.
Closing line: “Every reward signal breeds a creature it didn’t intend.”
Mapping against Ray Data Co
Medium mapping — daily datapoint, two operating-assumption updates worth noting:
- “SF consensus: the median person is screwed” is now the operating environment. Even discounting AWG’s pro-acceleration bias and the NYT op-ed’s selection effects, the convergence across engineers/VCs/founders is a real shift. RDCO’s positioning thesis (agent-deployer wedge, see 2026-04-14-levie-agent-deployer-role-jd and 2026-04-29-tim-ferriss-elad-gil-ai-frontier-billion-dollar-companies) operates inside this environment. The relevant founder-facing implication: the customer doesn’t need convincing the wave is coming; they need help choosing what role to play in it. Sales narrative shifts from “AI is happening” to “you’re either deploying agents or being deployed against.” This is consistent with Levie’s framing but the Apr 29/May 1 SF-consensus hardening makes it usable as cold-open language now, not aspirational.
- 1X’s 10K-this-year humanoid run validates the embodied-agent timeline AWG has been telegraphing. Combined with yesterday’s Figure-one-per-hour and SF Soft Life hotel datapoints, the embodied side of the agent-deployer thesis is shifting from “soon” to “this Christmas.” Doesn’t directly touch RDCO’s surface (we’re cognitive-labor, not physical), but it changes the macro narrative customers are absorbing: “agents” will increasingly mean both kinds in their heads, and our positioning will need to clarify which we’re deploying. Worth a sentence in the agent-deployer one-pager next time it’s edited.
- Reward-hacking as bestiary lands as a Sanity Check beat eventually, not now. The “every reward signal breeds a creature it didn’t intend” line is genuinely sharp and could anchor a Sanity Check piece on why operators should care about RL artifacts in their tools. But it’s adjacent to Karpathy’s “summoning ghosts not building animals” frame (transcripts/2025-10-17-dwarkesh-karpathy-ghosts-not-animals-transcript) and would need a fresh reframe to avoid the no-derivative-Sanity-Check rule. File as evidence, don’t pitch as topic.
Skip / track-only: Intel rally, SanDisk numbers, Huawei chip share, Joby JFK flight, Colossal bluebuck, DeepMind co-clinician, Swiss population referendum, Spotify verified-artist badges, OpenAI trial procedural — interesting weather, no RDCO surface.
Operating-assumption update worth flagging: SoftBank’s pre-revenue $100B Roze AI IPO target is the cleanest example yet of the “ouroboros of the AI capex cycle” AWG names directly — robots to build data centers to train models that build the next robots. If this prices, it’s the new ceiling for “what story you can sell with no shipped product,” and resets the bar for both RDCO’s own positioning narrative and what we should expect competitors to spin up.
Related
- 2026-04-30-innermost-loop-singularity-suitcases — yesterday’s entry; embodied-AI ramp continues
- 2026-04-29-innermost-loop-singularity-astonishment — Apr 29 entry; Symphony/Codex sweep + warmth-vs-accuracy
- 2026-04-14-levie-agent-deployer-role-jd — agent-deployer thesis foundation
- 2026-04-29-tim-ferriss-elad-gil-ai-frontier-billion-dollar-companies — “units of cognitive labor” framing for venture-scale wedge
- transcripts/2025-10-17-dwarkesh-karpathy-ghosts-not-animals-transcript — Karpathy’s “summoning ghosts” frame (alignment-bestiary precursor)
- 2026-04-08-better-harness-evals-hill-climbing — reward-hacking detection in harness eval loops