The paper trains language models to solve hard problems by first breaking them into smaller parts and then solving those parts, instead of only thinking in one long chain.
This paper teaches a model to write its own hints (sub-questions) and then use those hints to learn more effectively through reinforcement learning in which answers are checked automatically.
This paper shows a simple way for AI models to keep learning new things without forgetting what they already know.
Diffusion language models can write tokens in any order, but that freedom can accidentally hurt their ability to reason well.
AgencyBench is a giant test that checks how well AI agents can handle real, long, multi-step jobs, not just short puzzles.
The paper introduces Multiplex Thinking, a new way for AI to think by sampling several likely next words at once and blending them into a single super-token.
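To make the Multiplex Thinking idea concrete, here is a toy sketch of sampling several candidate next tokens and blending them into one vector. It is an illustration under assumptions, not the paper's implementation: the function name multiplex_token, the uniform averaging of embeddings, and the made-up vocabulary and embedding values are all hypothetical, and the actual method may weight or combine the samples differently.

```python
import random


def multiplex_token(probs, embeddings, k, rng):
    """Sample k token ids from `probs` and average their embeddings.

    Assumption: blending is a plain mean of the sampled tokens' embeddings.
    """
    # Draw k token ids (with replacement) according to the next-token distribution.
    ids = rng.choices(range(len(probs)), weights=probs, k=k)
    dim = len(embeddings[0])
    # Average the sampled embeddings coordinate by coordinate.
    blended = [sum(embeddings[i][d] for i in ids) / k for d in range(dim)]
    return ids, blended


# Toy vocabulary of 3 tokens with 2-dimensional embeddings (made-up values).
probs = [0.7, 0.2, 0.1]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ids, blended = multiplex_token(probs, embeddings, k=4, rng=random.Random(0))
print(ids, blended)
```

Because the samples are drawn from the model's own distribution, a confident prediction yields a blend close to a single token's embedding, while an uncertain one yields a mixture of several candidates.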