Moonshots EP 228: The Frontier Labs War — Opus 4.6, GPT 5.3 Codex, and the SuperBowl Ads Debacle

Summary

The episode opens with Anthropic’s surprise release of Claude Opus 4.6 (rumored to be a rebranded Sonnet 5), which the panel calls a “feel the AGI” moment. Alex highlights the C compiler feat: Opus 4.6 agent teams wrote a working multi-architecture C compiler in Rust for $20K of API calls, then used it to compile a Linux kernel, demonstrating productionized recursive self-improvement. The panel debates how the frontier labs differentiate (Anthropic on codegen, OpenAI on platform, Google on pre-training corpus, xAI on benchmaxing) but concludes that they are converging. A major segment covers Opus 4.6 discovering 500+ high-severity zero-day vulnerabilities in open-source code, which leads to a security discussion about AI-vs-AI cyber warfare and whether 2026 will see a major infrastructure incident. The episode also covers GPT-5.3 Codex developments, OpenAI’s ChatGPT market share decline, AI-driven cell-free protein synthesis (Ginkgo Bioworks + OpenAI), and a privacy segment.

Key Segments

[00:06-00:19] Opus 4.6 deep dive: benchmarks, C compiler achievement, agent team mode, pricing efficiency, 20+ hour autonomy time horizons
[00:19-00:25] Zero-day vulnerability discovery: 500+ bugs found, cybersecurity implications, AI-vs-AI warfare prediction, crypto vulnerability risk
[00:25-00:29] Science factories: Ginkgo Bioworks + OpenAI closed-loop protein synthesis, AI supervising the scientific method

Notable Claims

Opus 4.6 may actually be a distilled/rebranded Sonnet 5, which would explain its lower cost
Autonomy time horizons are following a “hyper-exponential” curve, possibly exceeding 20 hours for Opus 4.6
Mercor (data gathering company) reportedly hit $1B revenue run rate
Salim reports noticeably lower API costs with Opus 4.6 vs 4.5 for equivalent agent workloads

Guests / Panelists

Peter Diamandis (host), Alex Wissner-Gross, Dave Blundin (DB2), Salim Ismail

RDCO Mapping

Intelligence cost collapse: The C compiler case ($20K vs person-decades) is a concrete hyperdeflation data point for Sanity Check content
Recursive self-improvement: Opus 4.6 writing a C compiler and using it to build a Linux kernel is the clearest production example yet; strong newsletter angle
Cybersecurity: The zero-day discovery segment maps to operational security concerns for our own agent infrastructure
Science factories: The closed-loop experiment paradigm aligns with vault concept articles on AI-driven research