LLM agents are usually trained in a few worlds but asked to work in many different, unseen ones, which often hurts their performance.
This paper teaches a language-model agent to look up facts in millions of scientific paper summaries and answer clear, single-answer questions.
SAGE is a two-agent system that automatically writes tough, multi-step search questions and checks them by actually trying to solve them.
VIBEVOICE-ASR is a single-pass system that listens to up to 60 minutes of audio at once and outputs who spoke, when they spoke, and what they said in one stream.
The paper tackles understanding super long, first-person videos (days to a week) by giving an AI a smarter memory and better tools.
The paper shows how to speed up reinforcement learning (RL) for large language models (LLMs) by making numbers smaller (FP8) without breaking training.
DeepPlanning is a new benchmark that tests whether AI can make long, realistic plans that fit time and money limits.
Typhoon-S is a simple, open recipe that turns a basic language model into a helpful assistant and then teaches it important local skills, all on small budgets.
FABLE is a new retrieval system that helps AI find and combine facts from many documents by letting the AI both organize the library and choose the right shelves to read.
DRPG is a four-step AI helper that writes strong academic rebuttals by breaking a review into parts, fetching evidence, planning a strategy, and then writing the response.
The paper turns the 'holes' (missing spots) in depth camera images into helpful training hints instead of treating them as garbage.
This paper builds a fair, big playground (a benchmark) to test many EEG foundation models side-by-side on the same rules.