Moonshots EP 146: DeepSeek vs. OpenAI — The State of AI w/ Emad Mostaque & Salim Ismail
Summary
A WTF tech episode recorded in the immediate aftermath of DeepSeek R1’s release, with Emad Mostaque (founder and former CEO of Stability AI, now CEO of Intelligent Internet) and Salim Ismail providing technical analysis. Emad, who built one of the world’s top-10 supercomputers at Stability AI, judges DeepSeek’s claims technically credible. In his view, the 2,000 H800 GPU training run, the $1.2M cost, and the engineering innovations (a mixture-of-experts architecture with 640B parameters of which only 30B are activated, PTX-level code optimization, and prioritizing memory scaling over compute-speed scaling) all check out given China’s constraint-driven innovation culture.
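To make the mixture-of-experts point concrete, here is a minimal sketch of the arithmetic behind "640B parameters but only 30B activated", plus a toy top-k router. The parameter counts are the figures quoted in the episode. The `route_top_k` helper and its scores are hypothetical illustrations of generic top-k routing, not DeepSeek's actual router.

```python
# Figures as quoted in the episode summary (not independently verified).
TOTAL_PARAMS = 640e9    # total mixture-of-experts parameters
ACTIVE_PARAMS = 30e9    # parameters activated per token


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters used for any single token."""
    return active / total


def route_top_k(scores: list[float], k: int) -> list[int]:
    """Hypothetical generic top-k routing: pick the k highest-scoring experts.

    Only the chosen experts run for this token, which is why the
    active parameter count is far below the total.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active per token: {frac:.1%}")       # Active per token: 4.7%
print(route_top_k([0.1, 0.7, 0.2, 0.9], 2))  # [3, 1]
```

On these numbers, under 5% of the model's weights do work for any given token, which is the source of the compute savings the episode describes.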
Key technical insight from Emad: the real breakthrough was not distillation from OpenAI (though some contamination via internet data is inevitable) but the AlphaGo Zero-style reinforcement learning approach, in which R1-Zero generated its own synthetic training data. R1’s visible chain-of-thought reasoning (OpenAI hides it in o1) was a major factor in its viral adoption. His most striking projection: by next year, o1-level reasoning models will run on smartphones at 20 watts, roughly the power consumption of the human brain, making the current data center energy crisis narrative partially obsolete for inference.
The Nvidia impact analysis is nuanced: while the market panicked (Nvidia fell ~17% in a single day), Emad argues that overall AI demand is so explosive that chip demand won’t materially decline. He breaks down Nvidia’s GB300 NVL72 integrated boxes: 72 chips each, with inter-chip bandwidth he describes as equivalent to the entire internet, at $3M per box. He estimates DeepSeek’s model could be trained on 4-10 of these boxes at a total energy cost of ~$20K. Emad also previews his new company, Intelligent Internet, which is building an open-source AI stack for regulated industries where you need to know exactly what’s inside the model. He cites Anthropic’s “sleeper agents” paper as showing that a few thousand poisoned words out of 10 trillion training tokens can fundamentally alter model behavior.
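As a rough check on the NVL72 figures quoted above, the sketch below multiplies out the 4-10 box estimate. All inputs are the episode's own numbers. The `scale_range` helper is a hypothetical convenience function, and the result covers hardware list price only, not the ~$20K energy figure.

```python
# Figures as quoted in the episode; treat them as rough estimates.
BOX_PRICE_USD = 3_000_000      # quoted price per NVL72 box
CHIPS_PER_BOX = 72             # chips per box
BOXES_LOW, BOXES_HIGH = 4, 10  # estimated boxes needed to train the model


def scale_range(low_boxes: int, high_boxes: int, per_box: int) -> tuple[int, int]:
    """Multiply a per-box quantity across the low and high box counts."""
    return low_boxes * per_box, high_boxes * per_box


cost_low, cost_high = scale_range(BOXES_LOW, BOXES_HIGH, BOX_PRICE_USD)
chips_low, chips_high = scale_range(BOXES_LOW, BOXES_HIGH, CHIPS_PER_BOX)

print(f"Hardware: ${cost_low / 1e6:.0f}M to ${cost_high / 1e6:.0f}M")  # Hardware: $12M to $30M
print(f"Chips: {chips_low} to {chips_high}")                          # Chips: 288 to 720
```

Under these figures the hardware outlay is in the tens of millions of dollars, so the ~$20K estimate should be read strictly as the energy bill for a training run on that hardware.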
The conversation frames four emerging AI tiers: super-expert AGI (called on demand), personal AI (Apple/Google), open-weight models (DeepSeek/Llama, useful but risky for regulated industries), and fully open-source/open-data AI for decision-support systems requiring transparency.
Bias/Sponsor Notes
Standard Diamandis ad reads. Emad is promoting Intelligent Internet. He has obvious motivation to validate DeepSeek’s approach since it aligns with his open-source thesis and undermines the closed-source companies he now competes against. Diamandis discloses early xAI investment. The episode was recorded in the heat of the moment, so some claims about DeepSeek’s capabilities reflect day-one enthusiasm rather than settled analysis.