
Mon Dec 22 2025 19:00:00 GMT-0500 (Eastern Standard Time) · reference · source: Dwarkesh Patel (YouTube) · by Dwarkesh Patel
ai-scaling · continual-learning · agi-timelines · rl-vs-imitation · dwarkesh

“What are we scaling?” — Dwarkesh Patel

Episode summary

A solo essay narration in which Dwarkesh argues that the labs' short-AGI-timeline claims are internally inconsistent with their actual behavior: pre-baking skills into models via mid-training and RLVR (reinforcement learning from verifiable rewards). His thesis: if a humanlike learner were near, none of this elaborate skill-baking would be necessary; the mismatch between how the labs invest and what genuine AGI would require is the giveaway. Continual learning, not pure compute scaling, is the real bottleneck, and it likely takes another 5-10 years to crack at human level.

Key arguments / segments

Notable claims

Guests

Solo essay (no guests). References:

Mapping against Ray Data Co

Strong alignment with the Sanity Check editorial stance: this essay is the most clearly RDCO-positioned Dwarkesh essay in the recent run. The thesis that models keep getting more impressive at a short-timelines pace but more useful at a long-timelines pace is a near-perfect Sanity Check headline candidate.

Specific connections:

Sanity Check candidate hook: “The AGI tell isn’t the demos. It’s the supply chain of PhDs that the labs are paying to write training questions.”