Ex-Omni is a new open-source AI system that can understand text or speech and then talk back while moving a 3D face in sync with the voice.
ERNIE 5.0 is a single giant model that can read and create text, images, video, and audio by predicting the next pieces step by step, like writing a story one line at a time.
This paper argues that true world models are not just sprinkling facts into single tasks, but building a unified system that can see, think, remember, act, and generate across many situations.
DuetSVG is a new AI that learns to make SVG graphics by generating an image and the matching SVG code together, like sketching first and then tracing neatly.