
2026-02-12 · reference · source: Dwarkesh Patel (YouTube) · by Dwarkesh Patel + Dario Amodei
ai-scaling · anthropic · dario-amodei · exponential · rl-scaling · soft-takeoff · in-context-learning · agi-timelines · dwarkesh · big-blob-of-compute

Dario Amodei — “We are near the end of the exponential” (Dwarkesh Patel)

Why this is in the vault

This is the most consequential Dwarkesh interview of 2026 so far because the headline is the opposite of what every Anthropic-incentive-aligned reader would predict. The CEO of an AI lab whose entire valuation depends on continued scaling is on record saying we are near the end of the exponential — and saying it specifically because public discourse is failing to recognize how close we are to a regime change. This is the most useful single source for any RDCO position that involves “what does the next 3-5 years actually look like, separating capability claims from product reality.”

The other reason: Dario engages directly and at length with the same Sutton/Karpathy line we’ve been building positions on. We now have all three principals (Sutton, Karpathy, Dario) on tape disagreeing about the same question — “is RL scaling actually getting us anywhere new, or are we just polishing imitation?” — and Dario is the one who thinks the disagreement doesn’t matter.

The core argument

The exponential is roughly on schedule, but ending soon. Dario’s update over three years: the underlying technology has gone “about as I expected it to go,” plus or minus a year. The model march from smart high schooler to PhD-level professional has happened on the curve. What surprised him is the lack of public recognition that we’re near the end of it — people are still arguing about hot-button political stuff while a regime change is closing in.

Big Blob of Compute Hypothesis. This is the 2017-vintage Anthropic-internal frame Dario has held throughout. Seven things matter: (1) raw compute, (2) data quantity, (3) data quality/distribution, (4) training time, (5) an objective function that scales arbitrarily (pre-training loss, RL reward), and (6, 7) normalization and conditioning for numerical stability. Everything else is a footnote. Pre-training scaling worked because the objective function was scalable. RL scaling is now working for the same reason: they're seeing log-linear improvement with compute spend in RL on math, code, and a "wide variety of RL tasks." He claims RL scaling is operationally identical to pre-training scaling, just on a different objective.
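A quick way to make "log-linear improvement with compute" concrete. A minimal sketch; the compute/score pairs and the fit are illustrative assumptions, not numbers from the interview or from Anthropic:

```python
import numpy as np

# Hypothetical (training FLOPs, benchmark score) pairs -- made up for
# illustration. "Log-linear" means score grows linearly in log(compute):
# score ~ a + b * log10(FLOPs), i.e. a straight line on a log-x plot.
compute = np.array([1e21, 1e22, 1e23, 1e24])
score = np.array([42.0, 51.0, 60.5, 69.0])

b, a = np.polyfit(np.log10(compute), score, deg=1)
print(f"fit: score ~ {a:.1f} + {b:.1f} * log10(FLOPs)")

# The operational claim: the same straight-line behavior that held for
# pre-training loss now holds for RL reward, so each 10x of RL compute
# buys a roughly constant capability increment.
print(f"naive extrapolation at 1e25 FLOPs: {a + b * np.log10(1e25):.1f}")
```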

The Sutton/Karpathy critique doesn’t matter. Dwarkesh runs the standard line: if RL needs bespoke environments to learn Excel, we’re missing the core human-learning algorithm. Dario’s response: yes, there’s a genuine puzzle there, but it probably doesn’t matter. The pre-training distribution-broadening trajectory (from fanfic in GPT-1 to the entire internet in GPT-4) is doing the same thing as what humans would call “general learning.” Whether the model has a “real” learning algorithm is academic if the empirical curves keep delivering capability.

Soft takeoff, smooth exponential. Repeated multiple times. Dario's takeoff theory is not a discontinuous AGI moment. It's that current curves keep going (10% productivity uplift becomes 20%, becomes 25%, becomes 40%) while Amdahl's-law deployment friction gradually clears. Snowball, not avalanche. His calibration data point: internal coding models give Anthropic a 15-20% total-factor speedup right now, up from 5% six months ago.
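The Amdahl's-law point is worth computing out, because it is the whole mechanism behind "snowball, not avalanche." A sketch with made-up numbers: if only a fraction p of an organization's work can be accelerated by models, the org-level speedup is capped at 1/(1-p) no matter how fast the accelerated part gets, so the compounding comes from p creeping upward as friction clears, not from per-task speed exploding:

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Org-level speedup when a fraction p of work is accelerated s-fold."""
    return 1.0 / ((1.0 - p) + p / s)

# Illustrative only: with 20% of work accelerable, even an s=100 per-task
# speedup yields barely 1.25x overall; raising p matters more than s.
for p in (0.2, 0.4, 0.6, 0.8):
    print(f"p={p:.0%}: s=5 -> {amdahl_speedup(p, 5):.2f}x, "
          f"s=100 -> {amdahl_speedup(p, 100):.2f}x")
```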

Computer use as the deployment bottleneck. The honest answer to “when will AI do my video editor’s job” is: when computer use is reliable. OSWorld benchmark went from 15% to 65-70% in 15 months. Real-world reliability needs to come up from there. He thinks the country-of-geniuses-in-a-data-center model can do the video-editing job once computer use is solid.

On the productivity-paradox / METR study. Dwarkesh asks the obvious question: if these models are so productive, why does the controlled developer study show a roughly 20% slowdown? Dario's answer is twofold. (1) Inside Anthropic, with massive commercial pressure, they would absolutely cut these tools if they were a net negative; they are unambiguously a productivity win at frontier-lab scale. (2) The world-level evidence is the model-launch cadence: nobody is running away with a permanent lead because every lab pulls within months of the others. He concedes the productivity gain is small (was 5%, now 15-20%) and only now starting to compound.

On-the-job learning is replaced by long context + pre-baked skills. Dwarkesh keeps pushing on the “I hire someone, six months later they’re a powerhouse” point. Dario reframes: with coding, the codebase IS the learned context, and a 1M-token window can ingest it in seconds. So the “learning on the job” is collapsed into context loading. He concedes in-context learning is “weaker and shorter-term” than human on-the-job learning but bets continued scaling closes that gap.
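What "learning on the job collapsed into context loading" looks like mechanically. A minimal sketch, assuming a hypothetical 1M-token window and a crude ~4-chars-per-token heuristic (both are assumptions for illustration, not anything Dario specified):

```python
from pathlib import Path

CONTEXT_BUDGET = 1_000_000   # hypothetical 1M-token window
CHARS_PER_TOKEN = 4          # rough heuristic, not a real tokenizer

def load_codebase(root: str, budget: int = CONTEXT_BUDGET) -> str:
    """Concatenate source files into one prompt until the budget is spent.

    The reframe: instead of six months of human ramp-up, the accumulated
    state of the project is ingested in one pass as context.
    """
    chunks, used = [], 0
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(errors="ignore")
        tokens = len(text) // CHARS_PER_TOKEN
        if used + tokens > budget:
            break
        chunks.append(f"# file: {path}\n{text}")
        used += tokens
    return "\n\n".join(chunks)
```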

Mapping against Ray Data Co

This is the clearest articulation of the “soft takeoff” view we’ve encountered. Soft takeoff is a friendlier-to-RDCO scenario than the AI-2027 hard-takeoff frame: it means the next 3-5 years is a steady compounding of capability against gradually-clearing deployment friction, not a discontinuity. That’s a window where data quality, integration, governance, and context engineering all monetize. Hard-takeoff makes everything pre-takeoff irrelevant; soft-takeoff makes everything pre-takeoff load-bearing.

Specific newsletter ammunition:

Where Dario is most likely wrong (and we should publish dissent):

Sanity Check candidate hooks:

Open follow-ups