Typhoon-S is a simple, open recipe that turns a base language model into a helpful assistant and then teaches it important local skills, all on a small budget.
DeepVerifier is a plug-in checker that helps Deep Research Agents catch and fix their own mistakes while they are working, without retraining.
Academic rebuttals are not just about being polite; they are about smart, strategic persuasion under hidden information.
The paper asks a simple question: Which step-by-step explanations from a teacher model actually help a student model learn to reason better?
This paper is the first big map of how AI can fix real software problems, not just write short code snippets.
TranslateGemma is a family of open machine translation models fine-tuned from Gemma 3 to translate many languages more accurately.
X-Coder shows that models can learn expert-level competitive programming from data that is 100% synthetic, with no real contest problems needed.
Preference tuning teaches language models to act the way people like, but those habits can fall apart when the topic or style changes (domain shift).
EnvScaler is an automatic factory that builds many safe, rule-following practice worlds where AI agents can talk to users and call tools, just like real apps.
Long-term AI helpers remember past chats, but using all memories can trap them in old ideas (Memory Anchoring).
Multi-agent systems are like teams of expert helpers; the tricky part is choosing which helpers to ask for each question.
Supervised fine-tuning (SFT) often makes a model great at a new task but worse at its old skills; this paper explains a key reason why and how to fix it.