This paper tackles dataset distillation by giving a clear, math-backed way to keep only the most useful bits of data, so models can learn well from far fewer images.
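For readers unfamiliar with the technique, here is a toy sketch of one common flavor of dataset distillation (gradient matching on a few learnable synthetic points, shown purely as a generic illustration, not as this paper's specific method): a tiny synthetic set is optimized so that training on it produces gradients similar to training on the full dataset.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Generic gradient-matching sketch of dataset distillation (illustrative only).
torch.manual_seed(0)

# "Real" data: 1,000 two-dimensional points with a simple linear labeling rule.
X_real = torch.randn(1000, 2)
y_real = (X_real[:, 0] + X_real[:, 1] > 0).long()

# Distilled set: only 10 learnable points with fixed, balanced labels.
X_syn = torch.randn(10, 2, requires_grad=True)
y_syn = torch.tensor([0, 1] * 5)

model = nn.Linear(2, 2)
params = list(model.parameters())
opt_syn = torch.optim.Adam([X_syn], lr=0.05)

def flat_grad(loss, create_graph=False):
    grads = torch.autograd.grad(loss, params, create_graph=create_graph)
    return torch.cat([g.reshape(-1) for g in grads])

# Gradient from the full real dataset (the model stays fixed in this toy version).
g_real = flat_grad(F.cross_entropy(model(X_real), y_real)).detach()

for step in range(200):
    # Gradient from the tiny synthetic set, kept differentiable w.r.t. X_syn.
    g_syn = flat_grad(F.cross_entropy(model(X_syn), y_syn), create_graph=True)
    loss = 1 - F.cosine_similarity(g_real, g_syn, dim=0)  # match gradient directions
    opt_syn.zero_grad()
    loss.backward()
    opt_syn.step()
```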
This paper shows that many reasoning failures in AI are caused by a few distracting words in the prompt, not by the problems themselves being too hard.
The paper shows that growing the embedding part of a language model (especially with n-grams) can beat adding more MoE experts once you pass a certain sparsity 'sweet spot.'
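As a rough picture of what "growing the embedding part with n-grams" can mean, here is a generic hashed-bigram sketch (an assumption for illustration, not necessarily this paper's design): extra embedding capacity is spent on lookups keyed by short token sequences, added on top of ordinary token embeddings.

```python
import torch
import torch.nn as nn

class HashedNgramEmbedding(nn.Module):
    """Toy 'bigger embedding' layer: token embeddings plus hashed bigram embeddings.

    Illustrative sketch only; the bucket count and hashing scheme are arbitrary choices.
    """
    def __init__(self, vocab_size, dim, ngram_buckets=50_000):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, dim)
        self.bigram = nn.Embedding(ngram_buckets, dim)  # the extra capacity lives here
        self.vocab_size = vocab_size
        self.ngram_buckets = ngram_buckets

    def forward(self, ids):                 # ids: (batch, seq)
        emb = self.tok(ids)
        prev = torch.roll(ids, shifts=1, dims=1)
        prev[:, 0] = 0                      # no previous token at position 0
        # Hash each (previous, current) token pair into a bucket.
        bucket = (prev * self.vocab_size + ids) % self.ngram_buckets
        return emb + self.bigram(bucket)

layer = HashedNgramEmbedding(vocab_size=32_000, dim=64)
ids = torch.randint(0, 32_000, (2, 8))
print(layer(ids).shape)                     # torch.Size([2, 8, 64])
```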
The paper asks a simple question: if a language model becomes better at step-by-step reasoning (using RLVR), do its text embeddings also get better? The short answer is no.
Multimodal AI models can mix up what they see and what they hear, making things up across senses; this is called cross-modal hallucination.
This paper builds a big, reusable library of computer skills so an AI can use Windows apps more like a careful human, not a clumsy robot.
This paper introduces Foundation-Sec-8B-Reasoning, a small (8 billion parameter) AI model that is trained to “think out loud” before answering cybersecurity questions.
This paper shows that generating short videos can help an AI plan and reason visually better than writing out its steps in text.
DeepSearchQA is a new test with 900 real-world style questions that checks if AI agents can find complete lists of answers, not just one fact.
Language models store ideas along straight-line directions inside their internal representations, like sliders for “truth” or “ethics.”
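A minimal sketch of the "slider" picture, using a generic difference-of-means direction (an assumption for illustration, not necessarily this paper's exact method): a concept direction is estimated from contrasting examples, read out by projecting a hidden state onto it, and nudged by adding a multiple of it back in.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                                        # toy hidden-state dimension

# Hypothetical hidden states for "true" vs. "false" statements,
# separated along the first axis purely for demonstration.
h_true = rng.normal(0.0, 1.0, (100, d)) + 2.0 * np.eye(d)[0]
h_false = rng.normal(0.0, 1.0, (100, d)) - 2.0 * np.eye(d)[0]

# The "truth" slider: difference of class means, normalized to unit length.
direction = h_true.mean(axis=0) - h_false.mean(axis=0)
direction /= np.linalg.norm(direction)

# Reading the slider: project a new hidden state onto the direction.
h_new = rng.normal(0.0, 1.0, d)
score = h_new @ direction                     # larger -> more "truth-like"

# Moving the slider: steer the hidden state along the direction.
alpha = 3.0                                   # steering strength (arbitrary)
h_steered = h_new + alpha * direction
print(score, h_steered @ direction)           # projection increases by about alpha
```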
Idea2Story is a two-stage system that first studies many accepted research papers offline and then uses that knowledge online to turn a vague idea into a full scientific plan.
When training language models with RL that uses right-or-wrong rewards, learning can stall on 'saturated' problems that the model almost always solves.
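To see why a saturated problem contributes no learning signal, here is a minimal sketch assuming a GRPO-style group-relative advantage (an assumption; the paper's exact estimator may differ): when every sampled answer earns the same binary reward, all advantages are zero and the policy gradient for that prompt vanishes.

```python
import numpy as np

def group_advantages(rewards):
    """GRPO-style advantages: reward minus group mean, scaled by group std.

    Illustrative sketch only; assumes binary right-or-wrong rewards per rollout.
    """
    rewards = np.asarray(rewards, dtype=float)
    std = rewards.std()
    if std == 0:                    # all rollouts identical -> no learning signal
        return np.zeros_like(rewards)
    return (rewards - rewards.mean()) / std

# A "saturated" prompt: the model gets all 8 sampled answers right.
print(group_advantages([1, 1, 1, 1, 1, 1, 1, 1]))   # all zeros, so training stalls here

# A partially solved prompt still produces a useful learning signal.
print(group_advantages([1, 0, 1, 0, 0, 1, 1, 0]))   # nonzero advantages
```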