Moonshots EP 198: The AI War — OpenAI Ads, Sora 2, Grok, and Google’s Ad Business at Risk
Summary
Weekly roundup with Diamandis, Blundin, Wissner-Gross, and Salim Ismail covering the acceleration of AI-generated media and the frontier-model arms race. The episode opens with Meta's Vibes app for AI-generated video, notably built on third-party models (Midjourney, Black Forest Labs) rather than Meta's own, which Wissner-Gross reads as a signal that the space is moving too fast even for companies with $600B budgets.

Sora 2 gets an extended segment: Diamandis demos his own AI-generated videos (himself on the moon, pumping 500 lbs in a gym, interviewing Sam Altman) and walks through the creation process. Wissner-Gross notes Sora 2's real-world physics consistency and argues that video models will merge into autoregressive transformer architectures, enabling entirely new classes of reasoning: "if I ask you to think of a pink elephant, you visualize it; video models will give AI that ability." Suno 5 passes the musical Turing test.

The Claude Sonnet 4.5 segment is the richest: Wissner-Gross reports it working 30+ hours autonomously (up from 7 hours in the prior generation and 1 hour before that), suggesting hyperexponential rather than exponential autonomy growth. Anthropic's "Imagine" feature generates apps via just-in-time code generation on every click event; Wissner-Gross predicts this ends with "every single pixel being generated." He also raises the orthogonality-thesis question prompted by Anthropic's 10x reduction in lying and power-seeking: is power-seeking instrumentally convergent above some intelligence threshold, or can it truly be decoupled from capability?
Key Segments
- [00:01-05:00] Meta Vibes app — AI video generation via third-party models, shift from content selection to content generation
- [08:00-17:00] Sora 2 demos, Suno 5 music generation — musical Turing test passed, free tools, voice-only creation interface
- [18:00-23:00] Claude Sonnet 4.5 — 30+ hour autonomous work sessions, 10x reduction in lying/power-seeking, SWE-Bench 82%
- [23:00-27:00] Imagine (Claude) — just-in-time code generation per UI event, “every pixel will be generated” prediction (see the sketch after this list)
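The episode describes "Imagine" as regenerating the interface on every interaction rather than executing pre-written app code. Below is a minimal sketch of that loop under stated assumptions: `MODEL_ENDPOINT` and `generateView` are illustrative stand-ins, not Anthropic's actual API.

```typescript
// Hypothetical sketch of just-in-time UI generation: every click is sent to a
// model, which returns fresh HTML for the next view. The endpoint and function
// names are placeholders, not Anthropic's API.

const MODEL_ENDPOINT = "https://example.com/v1/generate"; // placeholder

// Ask the model for the next view, given the app's goal and the event that fired.
async function generateView(appGoal: string, event: string): Promise<string> {
  const res = await fetch(MODEL_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      prompt: `App goal: ${appGoal}\nUser event: ${event}\nReturn the HTML for the next view.`,
    }),
  });
  const { html } = await res.json();
  return html;
}

// Re-render on every click: no pre-written views, only just-in-time output.
async function run(appGoal: string): Promise<void> {
  document.body.innerHTML = await generateView(appGoal, "app_start");
  document.body.addEventListener("click", async (e) => {
    const clicked = (e.target as HTMLElement).outerHTML;
    document.body.innerHTML = await generateView(appGoal, `click on ${clicked}`);
  });
}

void run("simple todo list");
```

Each click discards the current DOM and asks the model for the next one, which is the sense in which "every single pixel" could eventually be generated.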
Notable Claims
- Meta used Midjourney and Black Forest Labs for Vibes rather than in-house models, despite a $600B budget
- Claude Sonnet 4.5 autonomy: 30+ hours straight (hyperexponential growth from 1hr -> 7hr -> 30+hr)
- Anthropic reduced lying and power-seeking by 10x in Sonnet 4.5
- Suno 5: generates studio-quality 8-minute songs; musical Turing test effectively passed
- Sora 2 and the other generation tools demoed are free: demonetization of creative production
- Blitzy hit 86.8% on SWE-Bench by combining multiple models (vs Sonnet 4.5 at 82% alone); one plausible combination pattern is sketched below
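The episode doesn't say how Blitzy combines models. One common pattern that could let a multi-model system outscore any single model is best-of-N sampling with a hard verifier; the sketch below uses the repo's test suite as that verifier. All names (`callModel`, `patchPassesTests`) are hypothetical stubs, not Blitzy's published method.

```typescript
// One plausible way to combine multiple models on a SWE-Bench-style task:
// sample a candidate patch from each model and keep the first one the verifier
// accepts. The stubs below simulate the model calls and the test run so the
// control flow is runnable end to end.

type ModelId = "model-a" | "model-b" | "model-c";

// Stand-in for a real model API call; returns a fake candidate patch.
async function callModel(model: ModelId, issue: string): Promise<string> {
  return `--- patch from ${model} for: ${issue} ---`;
}

// Stand-in for applying the patch in a sandbox and running the repo's tests.
async function patchPassesTests(patch: string): Promise<boolean> {
  return patch.includes("model-b"); // pretend only model-b's patch passes
}

// Ensemble loop: independent candidates plus a hard verifier is one way a
// multi-model system can beat any single model's pass rate.
async function solve(issue: string): Promise<string | null> {
  for (const model of ["model-a", "model-b", "model-c"] as ModelId[]) {
    const candidate = await callModel(model, issue);
    if (await patchPassesTests(candidate)) return candidate;
  }
  return null; // no candidate verified
}

solve("fix off-by-one in pagination").then(console.log);
```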
Bias/Framing Notes
The Bill Gates speech-recognition anecdote is used to frame current AI as “magical” without acknowledging the decades of foundational research that enabled current models. Blitzy gets prominent placement (Blundin connection). The “we’re living in the singularity” framing continues. The Sora 2 demo is essentially self-promotional content creation by the host.