ThinkRL-Edit teaches an image editor to think first and draw second, which makes tricky, reasoning-heavy edits much more accurate.
This paper shows that training a language model with reinforcement learning on just one carefully designed example can boost its reasoning across many school subjects, not just math.
Falcon-H1R is a small (7-billion-parameter) AI model that thinks really well without needing giant computers.
Large language models (LLMs) don’t act as a single brain; inside, each layer and module quietly follows its own mini decision rules, called internal policies.
Robust-R1 teaches vision-language models to notice how a picture is damaged, think through what that damage hides, and then answer as if the picture were clear.
The paper asks when reinforcement learning (RL) really makes language models better at reasoning beyond what they learned in pre-training.
COOPER is a single AI model that both “looks better” (perceives depth and object boundaries) and “thinks smarter” (reasons step by step) to answer spatial questions about images.