We Trained an AI on a Board Game. It Became a Better Customer Support Agent.
Alex Duffy (cofounder/CEO of Good Start Labs) describes how fine-tuning the Qwen3-235B model on thousands of rounds of the board game Diplomacy produced unexpected cross-domain improvements. The model showed a 10%+ improvement on other games (Hanabi, Wordle), but it also improved on Tau2 (a customer-support benchmark) and AssetOpsBench (IBM's industrial-operations benchmark).
The mechanism: games reward specific behaviors (context-tracking, shifting priorities, strategic communication) that transfer to real-world tasks. Diplomacy specifically trains persuasion and strategy without relying on luck, making it a rich environment for learning sequential decision-making.
The broader argument is that AI training is shifting from static data ingestion (predicting the next word) toward reinforcement learning in environments with goals and feedback. This produces something that resembles strategy rather than recall.
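The goal-plus-feedback loop described above can be sketched with a toy example. This is purely illustrative (a simple bandit-style environment with epsilon-greedy learning), not the article's actual setup, which fine-tuned Qwen3-235B on full Diplomacy games; all names and parameters here are invented for the sketch:

```python
import random

class GuessTheTargetEnv:
    """Toy environment with a goal and scalar feedback: the agent must
    discover which of n_actions matches a hidden target action."""

    def __init__(self, n_actions=5, target=3):
        self.n_actions = n_actions
        self.target = target

    def step(self, action):
        # Reward 1.0 for achieving the goal, 0.0 otherwise.
        return 1.0 if action == self.target else 0.0

def train(env, episodes=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy loop: act, observe reward, update value estimates.

    This is the core act-feedback-update cycle that distinguishes
    RL-style training from static next-word prediction.
    """
    rng = random.Random(seed)
    values = [0.0] * env.n_actions
    counts = [0] * env.n_actions
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.randrange(env.n_actions)  # explore
        else:
            action = max(range(env.n_actions), key=lambda a: values[a])  # exploit
        reward = env.step(action)  # feedback from the environment
        counts[action] += 1
        # Incremental running mean of observed rewards per action.
        values[action] += (reward - values[action]) / counts[action]
    return values

env = GuessTheTargetEnv()
values = train(env)
best = max(range(env.n_actions), key=lambda a: values[a])
print(best)  # with these settings, the learned policy prefers the target action
```

The point of the sketch is the shape of the loop, not the algorithm: the agent is graded by outcomes in an environment rather than by matching a static dataset, which is the shift the article argues is underway.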
RDCO mapping: The cross-domain transfer finding has implications for how we think about AI agent training. Rather than training agents narrowly on specific tasks, we may be able to rely on foundational capabilities in reasoning and context-tracking transferring broadly. The game-as-training-environment concept could inform how we design feedback loops for operational agents.