Breaking Tasks into Milestones: DeepMind's Fix for Long-Horizon Agent Failure
March 23, 2026
Long-horizon LLM agents fail in predictable ways: they loop, drift, and lose the thread. A new Google DeepMind paper proposes pairing inference-time subgoal decomposition with milestone-based RL rewards, and the reported numbers are striking.