CoDance is a new way to animate many characters in one picture using just one pose video, even if the picture and the video do not line up perfectly.
SemanticGen is a new way to make videos that starts by planning in a small, high-level 'idea space' (semantic space) and then adds the tiny visual details later.
Latent diffusion models are great at making images but learn the meaning of scenes slowly because their training goal mostly teaches them to clean up noise, not to understand objects and layouts.
EgoX turns a regular third-person video into a first-person video that looks like it was filmed from the actor’s eyes.