CubeComposer is a new AI method that turns a normal forward-facing video into a full 360° VR video at true 4K quality without using super-resolution upscaling.
NOVA is a new video editor that lets you change a few key frames (sparse control) while it carefully keeps the original motion and background details (dense synthesis).
HiFi-Inpaint is a new AI method that fills a missing area in a photo of a person by inserting a specific product, while keeping tiny details like logos, textures, and small text crisp.
Kiwi-Edit is a new video editor that follows your words and also copies looks from a picture you give it.
The paper argues that to build an AI that truly understands and simulates the real world, it must be consistent in three ways at once: across different senses (modal), across 3D space (spatial), and across time (temporal).
Modern image generators can still make strange mistakes like extra fingers or melted faces, and today’s vision-language models (VLMs) often miss them.
Mobile-O is a small but smart AI that can both understand pictures and make new images, and it runs right on your phone.
DreamID-Omni is one model that can create, edit, and animate human-centered videos with matching voices, all in sync.
Stroke3D lets you draw simple 2D stick-figure strokes plus a short text, and it builds a ready-to-animate 3D model with a skeleton and textures.
MOVA is an open-source AI that makes videos and sounds at the same time so mouths, actions, and noises match perfectly.
SkyReels-V3 is a single AI model that can make videos in three ways: from reference images, by extending an existing video, and by creating talking avatars from audio.
This paper shows how to turn any normal photo or video into a seamless 360° panorama without needing the camera’s settings like field of view or tilt.