The paper trains language models to solve hard problems by first breaking them into smaller parts and then solving those parts, instead of only thinking in one long chain.
MemSkill turns memory operations for AI agents into learnable skills instead of fixed, hand-made rules.
This paper shows how to widen a neural network in the middle of training without destabilizing it.
This paper shows that comics (multi-panel pictures with words) can help AI think through problems step by step, just as a student explains their work.
RANKVIDEO is a video-native reasoning reranker that helps search engines find the right videos for a text query by directly looking at the video’s visuals and audio, not just text captions.
UniReason is a single, unified model that plans with world knowledge before making an image and then edits its own result to fix mistakes, like a student drafting and revising an essay.
The paper tackles a new kind of search called Wide Research, where an AI must gather lots of related facts under complex rules and put them into a clean table.
SLIME is a new way to train chatbots so they follow human preferences without forgetting how to write well.
The paper introduces UnifiedReward-Flex, a reward model that judges images and videos the way a thoughtful human would, by flexibly changing what it checks based on the prompt and the visual evidence.
SWE-Universe is a factory-like system that turns real GitHub pull requests into safe, repeatable coding practice worlds with automatic checkers.
The paper shows that three popular ways to control language models—fine-tuning a few weights, LoRA, and activation steering—are actually the same kind of action: a dynamic weight update driven by a control knob.
This paper proposes ReSID, a new way to turn items into short token codes (Semantic IDs) that are much easier for a recommender to predict.
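To make "turning items into short token codes" concrete, here is a minimal sketch of one generic way to build a Semantic ID: residual quantization of an item embedding. This is an illustration only and is not ReSID's method, which the summary above does not describe. The function name `residual_quantize`, the toy codebooks, and the 2-D embedding are all hypothetical.

```python
import math

def residual_quantize(embedding, codebooks):
    """Map one item embedding to a tuple of code indices, one per codebook level.

    Generic residual quantization, used here only as an illustration;
    it is NOT taken from the ReSID paper.
    """
    residual = [float(x) for x in embedding]
    codes = []
    for codebook in codebooks:
        # Pick the codeword closest to what is still unexplained.
        index = min(range(len(codebook)),
                    key=lambda i: math.dist(codebook[i], residual))
        codes.append(index)
        # Subtract it so the next level encodes the leftover detail.
        residual = [r - c for r, c in zip(residual, codebook[index])]
    return tuple(codes)

# Hypothetical toy codebooks: a coarse level, then a finer one.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25]],
]

semantic_id = residual_quantize([1.2, 1.0], codebooks)
print(semantic_id)
```

Each position in the resulting tuple is one token, so a recommender predicts a short sequence of small-vocabulary tokens instead of one ID drawn from the full item catalog.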