Short videos are easy for AI to make sharp and lively, but long videos need a coherent story and a memory of what came before, and there isn’t much training data for that.
RoboCurate is a way to make better robot training videos by checking if the actions in a generated video actually match what a robot would do in a simulator.
DreamZero is a robot brain that learns actions by predicting short videos of the future and the matching moves at the same time.
Video generators are slow because attention compares every piece of the video with every other piece, so the cost grows quickly as videos get longer or larger.
PISCO is a video AI that lets you place a specific object into a real video exactly where and when you want, using just a few keyframes instead of editing every frame.
This paper shows how to make text-to-video models create clearer, steadier, and more on-topic videos without using any human-labeled ratings.
DreamActor-M2 is a new way to make a still picture move by copying motion from a video while keeping the character’s look the same.
LingBot-World is an open-source world model that turns video generation into an interactive, real-time simulator.
This paper shows how a video generator can improve its own videos during sampling, without extra training or outside checkers.
Cosmos Policy teaches robots to act by fine-tuning a powerful video model in just one training stage, without changing the model’s architecture.
VideoMaMa is a model that turns simple black-and-white object masks into soft, precise cutouts (alpha mattes) for every frame of a video.
Big video generators (diffusion models) create great videos but are too slow because they use hundreds of tiny clean-up (denoising) steps.
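
To see why the number of clean-up steps matters so much, here is a minimal sketch. It is not any paper's actual sampler: `toy_denoiser` is a made-up stand-in for the big video network, and `sample` is a hypothetical loop that simply counts how many times that network would have to run.

```python
import random


def toy_denoiser(x, strength=0.1):
    """Hypothetical stand-in for a large video network.

    A real model would predict and remove noise; this toy version
    just shrinks every value a little so the loop has work to do.
    """
    return [v * (1.0 - strength) for v in x]


def sample(num_steps, seed=0):
    """Start from random noise and apply the denoiser num_steps times.

    Returns the final values and the number of network calls made.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(4)]  # tiny "video" of 4 numbers
    calls = 0
    for _ in range(num_steps):
        x = toy_denoiser(x)
        calls += 1  # each step costs one full pass through the network
    return x, calls


_, calls_slow = sample(num_steps=200)
_, calls_fast = sample(num_steps=4)
print(calls_slow / calls_fast)  # 50.0
```

Each step is one full pass through the network, so in this toy setup going from 200 steps to 4 means 50 times fewer passes. The hard part, which this sketch does not show, is keeping the video quality high with so few steps.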