MMFineReason is a huge, open dataset (1.8 million examples, 5.1 billion solution tokens) that teaches AIs to think step by step about pictures and text together.
ASTRA is a fully automated way to train tool-using AI agents by making both their practice stories (trajectories) and their practice worlds (environments) without humans in the loop.
SERA is a new, low-cost way to train coding helpers (agents) that learn the style and private conventions of your own codebase.
OmegaUse is a new AI that can use phones and computers by looking at screenshots and deciding where to click, type, or scroll—much like a careful human user.
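An agent like this typically emits its actions as short text commands that a harness parses and executes on the real screen. The sketch below is a minimal, hypothetical version of that idea (the command format, `Action` class, and `parse_action` helper are illustrative assumptions, not OmegaUse's actual interface):

```python
import re
from dataclasses import dataclass

@dataclass
class Action:
    kind: str          # "click", "type", or "scroll"
    x: int = 0
    y: int = 0
    text: str = ""

def parse_action(output: str) -> Action:
    """Parse a model's textual action, e.g. 'click(320, 540)' or 'type("hi")'.

    Hypothetical format for illustration only.
    """
    if m := re.fullmatch(r"click\((\d+),\s*(\d+)\)", output):
        return Action("click", x=int(m[1]), y=int(m[2]))
    if m := re.fullmatch(r'type\("(.*)"\)', output):
        return Action("type", text=m[1])
    if m := re.fullmatch(r"scroll\((-?\d+)\)", output):
        return Action("scroll", y=int(m[1]))
    raise ValueError(f"unrecognized action: {output}")
```

A harness would take a screenshot, ask the model for the next command, run it through a parser like this, and execute the resulting click, keystrokes, or scroll.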
SimpleSeg teaches a multimodal language model to outline objects by writing down a list of points, like connecting the dots, instead of using a special segmentation decoder.
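The key trick is that an outline becomes ordinary text the model can read and write, so no extra mask-decoding head is needed. Here is a minimal round-trip sketch; the `(x,y)` text format and helper names are assumptions for illustration, not the paper's actual tokenization:

```python
def points_to_text(points):
    """Serialize a polygon outline as plain text the model can emit as tokens."""
    return " ".join(f"({x},{y})" for x, y in points)

def text_to_points(s):
    """Parse an emitted point list back into integer coordinates."""
    return [tuple(map(int, p.strip("()").split(","))) for p in s.split()]

outline = [(12, 30), (48, 30), (48, 72), (12, 72)]
encoded = points_to_text(outline)   # "(12,30) (48,30) (48,72) (12,72)"
assert text_to_points(encoded) == outline
```

Once the points are recovered, a standard polygon-fill routine can turn them into a pixel mask.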
This paper teaches code AIs to work more like real software engineers by adding a mid-training stage built from real development workflows.
DeepVerifier is a plug-in checker that helps Deep Research Agents catch and fix their own mistakes while they are working, without retraining.
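The "plug-in, no retraining" idea can be pictured as a wrapper around the agent's normal step: a separate checker inspects each output and, if it finds a problem, feeds the critique back so the agent can retry. This toy sketch is an assumed shape of such a loop, not DeepVerifier's actual API:

```python
def run_with_verifier(agent_step, verify, state, max_retries=2):
    """Wrap an agent step with a plug-in checker.

    If the verifier flags the output, append its critique to the agent's
    context and retry; the underlying model is never retrained.
    """
    for _ in range(max_retries + 1):
        output = agent_step(state)
        ok, critique = verify(output)
        if ok:
            return output
        state = state + f"\n[verifier feedback: {critique}]"
    return output  # give up after max_retries and return the last attempt

# Toy demo: an "agent" that only answers correctly after seeing feedback.
def toy_agent(state):
    return "4" if "feedback" in state else "5"

def toy_verify(out):
    return (out == "4", f"arithmetic error: 2+2 is not {out}")

print(run_with_verifier(toy_agent, toy_verify, "what is 2+2?"))  # prints 4
```

Because the checker only reads outputs and appends text, it can be bolted onto any existing agent.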
The paper asks a simple question: Which step-by-step explanations from a teacher model actually help a student model learn to reason better?
TranslateGemma is a family of open machine translation models fine-tuned from Gemma 3 to translate many languages more accurately.
X-Coder shows that models can learn expert-level competitive programming using data that is 100% synthetic—no real contest problems needed.
Preference tuning teaches language models to act the way people like, but those habits can fall apart when the topic or style changes (domain shift).
EnvScaler is an automatic factory that builds many safe, rule-following practice worlds where AI agents can talk to users and call tools, just like real apps.
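One way to picture a "rule-following practice world" is an environment that only executes a tool call if it passes the environment's policy checks. The class, rule shape, and refund example below are hypothetical illustrations, not EnvScaler's actual design:

```python
class ToolEnv:
    """A toy rule-checked environment for tool-using agents.

    Tool calls execute only if every policy rule approves them
    (a hypothetical sketch, not EnvScaler's API).
    """

    def __init__(self, tools, rules):
        self.tools = tools  # name -> callable
        self.rules = rules  # predicates over (name, args)

    def call(self, name, **args):
        if name not in self.tools:
            return "error: unknown tool"
        if not all(rule(name, args) for rule in self.rules):
            return "error: call blocked by environment rules"
        return self.tools[name](**args)

env = ToolEnv(
    tools={"refund": lambda amount: f"refunded ${amount}"},
    rules=[lambda name, args: args.get("amount", 0) <= 100],  # cap refunds
)
print(env.call("refund", amount=50))    # prints: refunded $50
print(env.call("refund", amount=500))   # prints: error: call blocked by environment rules
```

A factory in this spirit would stamp out many such environments, each with its own tools, rules, and simulated users, so agents can practice safely at scale.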