CubeComposer is a new AI method that turns a normal forward-facing video into a full 360° VR video at true 4K quality without using super-resolution upscaling.
The paper trains one model from scratch to both read text and see images/videos, instead of starting from a language-only model.
HiFi-Inpaint is a new AI method that fills a missing area in a photo of a person by inserting a specific product, while keeping tiny details like logos, textures, and small text crisp.
Kiwi-Edit is a new video editor that follows your text instructions and can also copy the look of a reference picture you give it.
DreamWorld is a new way to make videos that not only look real but also follow common-sense rules about motion, space, and meaning.
Short videos are easy for AI to make sharp and lively, but long videos need stories and memory, and there isn’t much training data for that.
SenCache speeds up video diffusion models by reusing the model's cached output from an earlier step, but only when that output is predicted to change very little.
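To make the caching idea concrete, here is a minimal sketch of a denoising loop that skips the expensive model call whenever a cheap predictor says the output would barely change. It is a generic illustration, not SenCache's actual algorithm: the names `run_with_cache`, `predict_change`, `toy_model`, and `toy_change`, the scalar state, and the threshold rule are all assumptions made for this example.

```python
from typing import Callable, Optional, Tuple


def run_with_cache(
    model: Callable[[float, int], float],
    predict_change: Callable[[float, float, int], float],
    x0: float,
    num_steps: int,
    threshold: float,
    step_size: float = 0.1,
) -> Tuple[float, int]:
    """Run a toy denoising loop, reusing the cached model output when
    the predicted change is below `threshold`.

    Returns the final state and the number of real model calls made.
    """
    x = x0
    cached_out: Optional[float] = None  # last real model output
    cached_x: Optional[float] = None    # input that produced it
    model_calls = 0

    for t in range(num_steps):
        reuse = (
            cached_out is not None
            and cached_x is not None
            and predict_change(x, cached_x, t) < threshold
        )
        if reuse:
            out = cached_out  # cheap path: skip the model
        else:
            out = model(x, t)  # expensive path: refresh the cache
            cached_out, cached_x = out, x
            model_calls += 1
        x = x - step_size * out  # one denoising update

    return x, model_calls


def toy_model(x: float, t: int) -> float:
    # Hypothetical stand-in for a diffusion network: predicts the "noise" as x.
    return x


def toy_change(x: float, cached_x: float, t: int) -> float:
    # Hypothetical predictor: assume the output changes as much as the input did.
    return abs(x - cached_x)
```

With `threshold=0.0` the cache is never reused, so the model runs at every step. Raising the threshold trades accuracy for fewer model calls, which is the speed-versus-quality dial this kind of method exposes.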
SkyReels-V4 is a single, unified model that makes videos and matching sounds together, while also letting you fix or change parts of a video.
Diffusion models make great images but are slow because they remove noise step by step, over many iterations.
Mobile-O is a small but smart AI that can both understand pictures and make new images, and it runs right on your phone.
JavisDiT++ is a new AI that makes short videos and matching sounds from a text prompt, keeping sight and sound in sync.
SARAH is a real-time system that makes virtual characters move their whole bodies naturally during a conversation while knowing where the user is.