Long texts overwhelm many language models, which forget important bits and slow down as the context grows.
WorldCompass teaches video world models to follow actions more faithfully, while keeping the generated video looking good, by applying reinforcement learning after pretraining.
LLaDA2.1 teaches a diffusion-style language model to write fast rough drafts and then fix its own mistakes by editing tokens it already wrote.
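The draft-then-edit idea can be pictured with a toy loop (an assumed simplification, not LLaDA2.1's actual decoder): draft every token at once with a confidence score, then re-edit only the low-confidence tokens over a few passes.

```python
# Toy sketch of draft-then-edit decoding (hypothetical helpers, not
# the paper's code): tokens the model is unsure about get rewritten.
import random

random.seed(0)

def draft(length):
    # Rough parallel draft: (token, confidence) pairs.
    return [(f"tok{i}", random.random()) for i in range(length)]

def edit_pass(tokens, threshold=0.6):
    # Re-edit only low-confidence tokens, marking them and raising
    # their confidence; confident tokens are left alone.
    out = []
    for tok, conf in tokens:
        if conf < threshold:
            out.append((tok + "*", min(1.0, conf + 0.5)))
        else:
            out.append((tok, conf))
    return out

tokens = draft(6)
for _ in range(2):
    tokens = edit_pass(tokens)
```

After a couple of passes every token sits above the confidence threshold, which is the rough shape of "fast rough draft, then fix your own mistakes."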
This paper introduces SecCoderX, a way to teach code-writing AIs to be secure without breaking what the code is supposed to do.
This paper teaches a computer to find buttons, text, and icons on screens so it can click and type in the right places, a skill called GUI grounding.
V-Retrver is a new way for AI to search across text and images by double-checking tiny visual details instead of only guessing from words.
BudgetMem is a way for AI helpers to build and use memory on the fly, picking how much thinking to spend so answers are both good and affordable.
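A minimal sketch of the cost-versus-quality trade (all names hypothetical, not BudgetMem's actual code): a bigger budget reads more memory entries, spending more tokens for a better-grounded answer.

```python
# Toy budget-aware memory lookup: the budget decides how many stored
# memories (and so how many tokens) the helper spends on a question.

def retrieve(memory, query, k):
    # Naive relevance score: count words shared with the query.
    q_words = set(query.lower().split())
    scored = sorted(
        memory,
        key=lambda m: -len(set(m.lower().split()) & q_words),
    )
    return scored[:k]

def answer_context(memory, query, budget):
    # Cheap mode reads one memory; expensive mode reads several.
    k = 1 if budget == "low" else 4
    return retrieve(memory, query, k)

memory = [
    "the meeting moved to friday",
    "alice prefers tea",
    "the meeting room is 4b",
    "bob is on vacation",
]
low = answer_context(memory, "when is the meeting", "low")
high = answer_context(memory, "when is the meeting", "high")
```

The low-budget call reads a single entry while the high-budget call reads four, which is the basic knob the summary describes.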
TRIT is a new training method that teaches AI to translate and think at the same time so it can solve hard problems in many languages without extra helper models.
Rigging 3D characters is a bottleneck: making bones and skin weights by hand is slow and tricky, and past automatic tools often guess the skin weights poorly.
ERNIE 5.0 is a single giant model that can read and create text, images, video, and audio by predicting the next pieces step by step, like writing a story one line at a time.
Agent-Omit teaches AI agents to skip unneeded thinking and old observations, cutting tokens while keeping accuracy high.
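The observation-skipping half can be sketched in a few lines (an assumed simplification of Agent-Omit's idea): old tool observations are replaced by a short stub so the context stops growing, while recent turns stay intact.

```python
# Toy observation pruning for an agent's message history: anything
# older than the last `keep_recent` messages that is an observation
# gets collapsed to "[omitted]", saving tokens.

def prune_observations(messages, keep_recent=2):
    n = len(messages)
    pruned = []
    for i, msg in enumerate(messages):
        is_old_obs = msg["role"] == "observation" and i < n - keep_recent
        if is_old_obs:
            pruned.append({"role": "observation", "content": "[omitted]"})
        else:
            pruned.append(msg)
    return pruned

history = [
    {"role": "user", "content": "find the cheapest flight"},
    {"role": "observation", "content": "search results page 1 ..."},
    {"role": "observation", "content": "search results page 2 ..."},
    {"role": "observation", "content": "flight AB123, $120"},
    {"role": "assistant", "content": "booking AB123"},
]
trimmed = prune_observations(history)
```

Only the stale search pages are stubbed out; the recent observation the agent still needs survives.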
Reasoning Cache (RC) is a new way for AI to think in steps: it writes some thoughts, makes a short summary, throws away the long thoughts, and then keeps going using only the summary.
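The summarize-and-discard loop can be sketched with stand-in functions (these are not the paper's actual model calls): each round of "thinking" is compressed to a short summary, so the carried context stays bounded no matter how many steps the model takes.

```python
# Toy Reasoning Cache loop: think, summarize, discard, repeat.

def think(summary, step):
    # Stand-in for a model call producing a long chunk of reasoning
    # conditioned on the running summary.
    return summary + f" | long detailed reasoning for step {step}"

def summarize(chunk, max_len=40):
    # Stand-in for the model compressing its own thoughts.
    return chunk[-max_len:]

def reason_with_cache(task, steps):
    summary = task
    for step in range(steps):
        chunk = think(summary, step)   # long thoughts
        summary = summarize(chunk)     # keep only the short summary
        # chunk is dropped here, so context never exceeds max_len
    return summary

final = reason_with_cache("plan a trip", steps=5)
```

However many steps run, what gets carried forward is never longer than the summary cap, which is the whole point of throwing the long thoughts away.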