Small AI models often stumble when a tool call fails, then get stuck repeating the same bad call instead of fixing the mistake.
This paper asks a new question for vision-language models: not just 'What do you see?' but 'How far along is the task right now?'
Diffusion language models can write tokens in any order, but that freedom can accidentally hurt their ability to reason well.
Robots used to explore by following simple rules or short-term rewards, which often made them waste time and backtrack a lot.
KAGE-Bench is a fast, carefully controlled benchmark that tests how well reinforcement learning (RL) agents trained on pixels handle specific visual changes, like new backgrounds or lighting, without changing the actual game rules.
The paper introduces Intervention Training (InT), a simple way for a language model to find and fix the first wrong step in its own reasoning using a short, targeted correction.
This survey explains how to make AI agents not just smart, but also efficient with their time, memory, and tool use.
The paper tackles a big problem: when you merge several models trained with reinforcement learning, simple averaging waters down their special skills.
Think3D lets AI models stop guessing from flat pictures and start exploring real 3D space, like walking around a room in a video game.
This paper teaches video-making AIs to follow real-world physics, so rolling balls roll right and collisions look believable.
RL-trained search agents often sound confident even when they don’t actually know the answer, which can mislead people.
This paper is the first big map of how AI can fix real software problems, not just write short code snippets.