This paper teaches language models to be safer, more factual, and higher quality during pretraining, not just after, by using reinforcement learning with a stronger model as a helper.
Innovator-VL is a new multimodal AI model that understands both pictures and text to help solve science problems without needing mountains of special data.
SimpleSeg teaches a multimodal language model to outline objects by writing down a list of points, like connecting the dots, instead of using a special segmentation decoder.
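To make the "connecting the dots" idea concrete, here is a minimal sketch, not SimpleSeg's actual code: the model writes boundary points as plain text, and a small parser turns them into a binary mask. The point format, image size, and example output are assumptions for illustration.

```python
# Toy sketch of "segmentation as a point sequence": parse (x, y) pairs the
# model wrote as text, then rasterize the polygon into a mask. The coordinate
# format here is an assumption, not SimpleSeg's exact output format.
import re
import numpy as np
from PIL import Image, ImageDraw

def points_to_mask(point_text: str, height: int, width: int) -> np.ndarray:
    """Parse '(x, y)' pairs from model text and fill the resulting polygon."""
    coords = [(float(x), float(y))
              for x, y in re.findall(r"\(\s*([\d.]+)\s*,\s*([\d.]+)\s*\)", point_text)]
    mask = Image.new("L", (width, height), 0)
    if len(coords) >= 3:                      # need at least a triangle to fill
        ImageDraw.Draw(mask).polygon(coords, outline=1, fill=1)
    return np.array(mask, dtype=bool)

# Hypothetical model output: the outline of a box, "connect the dots" style.
text = "The object outline is (10, 10) (60, 12) (58, 55) (12, 50)."
mask = points_to_mask(text, height=64, width=64)
print(mask.sum(), "pixels inside the predicted outline")
```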
AdaReasoner teaches AI to pick the right visual tools, use them in the right order, and stop using them when they aren’t helping.
This paper teaches code AIs to work more like real software engineers by adding a middle stage of training, after general pretraining but before final tuning, built from real development workflows.
Endless Terminals is an automatic factory that builds thousands of realistic, checkable computer-terminal tasks so AI agents can practice and improve with reinforcement learning.
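A toy sketch of what a "checkable terminal task" can look like, not the Endless Terminals pipeline itself: each generated task carries a setup step and a programmatic check, so any shell command an agent proposes can be scored automatically. The log-counting task and the reward rule are invented, and the check assumes a Unix shell with grep available.

```python
# Invented example task: count ERROR lines in a generated log file, then score
# an agent's shell command by comparing its output to the known answer.
import subprocess, tempfile, os, random

def make_task():
    """Generate a tiny 'count the matching lines' terminal task with a known answer."""
    workdir = tempfile.mkdtemp()
    n_hits = random.randint(3, 9)
    lines = ["ERROR disk full"] * n_hits + ["INFO ok"] * random.randint(5, 15)
    random.shuffle(lines)
    with open(os.path.join(workdir, "app.log"), "w") as f:
        f.write("\n".join(lines) + "\n")
    prompt = "Print the number of ERROR lines in app.log."
    return workdir, prompt, n_hits

def check(workdir: str, agent_command: str, expected: int) -> float:
    """Run the agent's command inside the task directory; reward 1.0 if correct."""
    out = subprocess.run(agent_command, shell=True, cwd=workdir,
                         capture_output=True, text=True, timeout=10)
    return 1.0 if out.stdout.strip() == str(expected) else 0.0

workdir, prompt, answer = make_task()
print(prompt, "->", check(workdir, "grep -c ERROR app.log", answer))
```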
This paper shows how to keep training a language model while it is solving one hard, real problem, so it can discover a single, truly great answer instead of many average ones.
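A toy numerical illustration of that contrast, not the paper's method: with a frozen model you can only keep the best of N samples, but if you keep updating the model on its own best attempts at the one problem, the whole distribution drifts toward a much better answer. The Gaussian "policy", the target value, and the update rule are all invented stand-ins.

```python
# Toy contrast: best-of-N from a frozen sampler vs. continued training on the
# same single problem. Everything here is an invented stand-in for a real model.
import numpy as np

rng = np.random.default_rng(0)
target = 7.3                                   # the one hard problem's true answer
reward = lambda x: -abs(x - target)

# Frozen model: best-of-N sampling around a fixed guess.
frozen_samples = rng.normal(0.0, 1.0, size=64)
print("best-of-64, frozen:", frozen_samples[np.argmax(reward(frozen_samples))])

# Keep training on the same problem: nudge the mean toward the best recent sample.
mu = 0.0
for _ in range(30):
    samples = rng.normal(mu, 1.0, size=8)
    mu += 0.5 * (samples[np.argmax(reward(samples))] - mu)
print("after continued training on this one problem:", mu)
```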
This paper asks a new question for vision-language models: not just 'What do you see?' but 'How far along is the task right now?'
KAGE-Bench is a fast, carefully controlled benchmark that tests how well reinforcement learning (RL) agents trained on pixels handle specific visual changes, like new backgrounds or lighting, while the underlying game rules stay exactly the same.
The paper introduces Intervention Training (InT), a simple way for a language model to find and fix the first wrong step in its own reasoning using a short, targeted correction.
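A minimal sketch of the intervention idea, assuming a step-level verifier and a source of corrections (in the paper's setting those would be much stronger than the toy arithmetic checker below): walk the reasoning trace, keep everything before the first failing step, and splice in a short correction so the model can continue from there.

```python
# Toy intervention loop: find the first wrong step and replace it with a
# targeted correction. The verifier and corrector here use eval on tiny
# arithmetic strings purely for illustration.
from typing import Callable, List

def intervene(steps: List[str],
              verify: Callable[[str], bool],
              correct: Callable[[str], str]) -> List[str]:
    """Keep steps up to the first failure, replace it, drop the stale tail."""
    for i, step in enumerate(steps):
        if not verify(step):
            return steps[:i] + [correct(step)]   # the model would re-generate from here
    return steps

# Hypothetical reasoning trace with one arithmetic slip in the middle.
trace = ["2 + 3 = 5", "5 * 4 = 21", "21 - 1 = 20"]
fixed = intervene(trace,
                  verify=lambda s: eval(s.split("=")[0]) == float(s.split("=")[1]),
                  correct=lambda s: s.split("=")[0] + "= " + str(eval(s.split("=")[0])))
print(fixed)   # ['2 + 3 = 5', '5 * 4 = 20']
```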
This survey explains how to make AI agents not just smart, but also efficient with their time, memory, and tool use.
The paper solves a big problem: when you merge several reinforcement-learned models, their special skills get watered down by simple averaging.
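A small worked example of why plain averaging dilutes specialists, plus one simple alternative (keeping the largest per-parameter change from the shared base model). This illustrates the problem the paper targets, not necessarily the paper's own merging rule.

```python
# Three "experts" each fine-tuned one skill into a different parameter.
# Averaging shrinks every skill by 1/N; picking the strongest per-parameter
# delta from the shared base keeps each skill at full strength.
import numpy as np

base = {"w": np.zeros(4)}
experts = [
    {"w": np.array([2.0, 0.0, 0.0, 0.1])},   # skill A lives in w[0]
    {"w": np.array([0.0, 2.0, 0.0, -0.1])},  # skill B lives in w[1]
    {"w": np.array([0.0, 0.0, 2.0, 0.0])},   # skill C lives in w[2]
]

def merge(models, base, mode="average"):
    deltas = np.stack([m["w"] - base["w"] for m in models])
    if mode == "average":                     # dilutes each skill to 1/N strength
        merged_delta = deltas.mean(axis=0)
    else:                                     # keep the strongest per-parameter edit
        idx = np.abs(deltas).argmax(axis=0)
        merged_delta = deltas[idx, np.arange(deltas.shape[1])]
    return {"w": base["w"] + merged_delta}

print("averaged: ", merge(experts, base)["w"])          # skills shrink to ~0.67
print("max-delta:", merge(experts, base, "max")["w"])   # skills stay at 2.0
```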