SkyReels-V4 is a single, unified model that makes videos and matching sounds together, while also letting you fix or change parts of a video.
Mobile-O is a compact AI model that can both understand pictures and make new images, and it runs right on your phone.
JavisDiT++ is a new AI that makes short videos and matching sounds from a text prompt, keeping sight and sound in sync.
CoDance is a new way to animate many characters in one picture using just one pose video, even if the picture and the video do not line up perfectly.
SemanticGen is a new way to make videos that starts by planning in a small, high-level 'idea space' (semantic space) and then adds the tiny visual details later.
Latent diffusion models are great at making images but learn the meaning of scenes slowly because their training goal mostly teaches them to clean up noise, not to understand objects and layouts.
EgoX turns a regular third-person video into a first-person video that looks as if it were filmed through the actor’s eyes.