ERNIE 5.0 is a single giant model that can read and create text, images, video, and audio by predicting the next pieces step by step, like writing a story one line at a time.
LingBot-World is an open-source world model that turns video generation into an interactive, real-time simulator.
JudgeRLVR teaches a model to be a strict judge of answers before it learns to generate them, which trims bad ideas early.
Coding agents that fix software rely on feedback, but unit tests give only pass/fail signals that are often noisy or missing.
The paper introduces Canon layers, tiny add-ons that let nearby words share information directly, like passing notes along a row of desks.
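To make the "passing notes" picture concrete, here is a minimal sketch. It assumes the add-on works like a small causal mixing step: each token keeps its own vector and adds a weighted sum of itself and the few tokens just before it. The function name `local_mix`, the window size, and the weights are all hypothetical choices for illustration, not details taken from the paper.

```python
def local_mix(hidden, weights):
    """Mix each token with the tokens just before it (illustrative sketch only).

    hidden:  list of per-token vectors (lists of floats), in sequence order.
    weights: weights[k] scales the token k steps back (k = 0 is the token itself).
    """
    mixed = []
    for t, vec in enumerate(hidden):
        out = list(vec)  # residual path: start from the token's own vector
        for k, w in enumerate(weights):
            if t - k < 0:
                break  # causal: nothing exists before the first token
            neighbor = hidden[t - k]
            for d in range(len(out)):
                out[d] += w * neighbor[d]
        mixed.append(out)
    return mixed


# Three toy tokens with two features each; the weights are made up.
tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
mixed = local_mix(tokens, weights=[0.5, 0.25])
print(mixed)  # [[1.5, 0.0], [0.25, 1.5], [3.0, 3.25]]
```

The second token ends up with a little of the first token's features, and the third with a little of the second's, without any attention step in between.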
Before this work, most big language models generated text one word at a time (autoregressively), which made generation slow and hard to parallelize because each new word has to wait for all the words before it.
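The one-word-at-a-time bottleneck can be sketched as a simple loop. This is a toy illustration, not code from the work summarized above: `generate`, `toy_next_token`, and `TOY_TABLE` are hypothetical names, and the lookup table stands in for a real model.

```python
def generate(next_token, prompt, max_new_tokens, stop="<eos>"):
    """Autoregressive decoding: each step needs every token produced so far."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        token = next_token(tokens)  # cannot start until the previous step is done
        if token == stop:
            break
        tokens.append(token)
    return tokens


# Hypothetical stand-in for a model: a fixed lookup on the last token.
TOY_TABLE = {"the": "cat", "cat": "sat", "sat": "<eos>"}


def toy_next_token(tokens):
    return TOY_TABLE.get(tokens[-1], "<eos>")


output = generate(toy_next_token, ["the"], max_new_tokens=5)
print(output)  # ['the', 'cat', 'sat']
```

Because every call to `next_token` depends on the result of the previous one, the steps inside the loop cannot run side by side, which is the slowdown the summary refers to.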
ProPhy is a new two-step method that helps video generation models follow real-world physics instead of just making pretty pictures.