VIBEVOICE-ASR is a single-pass system that listens to up to 60 minutes of audio at once and outputs who spoke, when they spoke, and what they said in one stream.
Typhoon-S is a simple, open recipe that turns a basic language model into a helpful assistant and then teaches it important local skills, all on small budgets.
Transformers slow down on very long inputs because standard attention compares every pair of tokens, so the cost grows with the square of the input length.
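As a rough illustration of why looking at every token pair gets expensive, here is a minimal sketch that just counts the pairs full attention would score. The function name `attention_pair_count` is made up for this example; it does not come from any of the papers listed here and ignores details such as heads, layers, and masking.

```python
def attention_pair_count(num_tokens: int) -> int:
    """Count the (query, key) pairs that full self-attention scores.

    Hypothetical helper for illustration only: every token attends to
    every token (itself included), giving num_tokens * num_tokens pairs.
    """
    return num_tokens * num_tokens


if __name__ == "__main__":
    # Each step doubles the input length.
    for n in (1_000, 2_000, 4_000):
        print(f"{n:>5} tokens -> {attention_pair_count(n):>12,} pairs")
```

Doubling the input length quadruples the number of pairs, which is why very long inputs become slow.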
LongCat-Flash-Thinking-2601 is a huge 560-billion-parameter Mixture-of-Experts model built to act like a careful helper that can use tools, browse, code, and solve multi-step tasks.
DSGym is a unified 'gym' where AI data science agents are tested and trained by actually running code on real datasets, not just chatting about them.
IVRA is a simple, training-free add-on that helps robot brains keep the 2D shape of pictures while following language instructions.
This paper shows that giving an AI a safe, tiny virtual computer (a sandbox) lets it solve many kinds of problems better, not just coding ones.
AI agents often act very sure of themselves even when they are wrong, especially on long, multi-step tasks.
Academic rebuttals are not just about being polite; they are about smart, strategic persuasion under hidden information.
Small AI models often stumble when a tool call fails and then get stuck repeating bad calls instead of fixing the mistake.
Robots need videos that not only look pretty but also follow real-world physics and finish the task asked of them.
Diffusion language models can write tokens in any order, but that freedom can accidentally hurt their ability to reason well.