This paper teaches a language-model agent to look up facts across millions of scientific paper summaries and answer unambiguous questions that have a single correct answer.
This paper teaches AI to turn simple dialogue into full movie scenes by first writing a detailed script and then filming it step by step.
SAMTok turns any object’s mask in an image into just two special “words” so language models can handle pixels like they handle text.
The paper introduces Intervention Training (InT), a simple way for a language model to find and fix the first wrong step in its own reasoning using a short, targeted correction.
DARC teaches big language models to get smarter by splitting training into two calm, well-organized steps instead of one chaotic loop.
ToolPRMBench is a new benchmark that checks, step by step, whether an AI agent using tools picks the right next action.
This paper teaches video-making AIs to follow real-world physics, so rolling balls roll right and collisions look believable.
MatchTIR teaches AI agents to judge each tool call step-by-step instead of giving the same reward to every step.
Large language models usually get only a final thumbs-up or thumbs-down at the end of an answer, which comes too late to catch mistakes made in the middle.
ToolSafe is a new way to keep AI agents safe when they use external tools, by checking each action before it runs.
The paper introduces M^4olGen, a two-stage system that designs new molecules to match exact numbers for several properties (like QED, LogP, MW, HOMO, LUMO) at the same time.
Fast-ThinkAct teaches a robot to plan with a few tiny hidden "thought tokens" instead of long paragraphs, making it much faster while staying smart.