This paper shows how to make text-to-video models create clearer, steadier, and more on-topic videos without using any human-labeled ratings.
DreamActor-M2 is a new way to make a still picture move by copying motion from a video while keeping the character’s look the same.
LingBot-World is an open-source world model that turns video generation into an interactive, real-time simulator.
This paper shows how a video generator can improve its own videos during sampling, without extra training or an outside model to check its work.
Cosmos Policy teaches robots to act by fine-tuning a powerful video model in just one training stage, without changing the model’s architecture.
VideoMaMa is a model that turns simple black-and-white object masks into soft, precise cutouts (alpha mattes) for every frame of a video.
Large video diffusion models create great videos but are slow, because they need hundreds of small clean-up (denoising) steps to turn noise into a finished clip.
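To make the cost concrete, here is a minimal, generic sketch (not the paper's method; all names are illustrative) of the standard iterative denoising loop: every step is one full pass through a big network, and the steps run one after another, so 200 steps means 200 expensive, non-parallelizable forward passes per clip.

```python
# Generic illustration of why diffusion sampling is slow: one network pass per
# denoising step, run sequentially. The "model" here is a toy stand-in.
import numpy as np

def denoise_step(noisy_latent: np.ndarray, step: int, num_steps: int) -> np.ndarray:
    """Stand-in for one pass of a large video diffusion network.
    In a real model, this single call dominates the cost of generation."""
    target = np.zeros_like(noisy_latent)            # pretend "clean" latent
    blend = 1.0 / (num_steps - step)                # move a little closer each step
    return noisy_latent + blend * (target - noisy_latent)

def sample_video(shape=(16, 64, 64), num_steps: int = 200, seed: int = 0) -> np.ndarray:
    """Classic sampling loop: start from pure noise, clean it up gradually
    over `num_steps` sequential steps (frames x height x width latent)."""
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal(shape)
    for step in range(num_steps):
        latent = denoise_step(latent, step, num_steps)
    return latent

video_latent = sample_video(num_steps=200)          # 200 sequential network calls
print(video_latent.shape)
```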
Motive is a new way to figure out which training videos teach an AI how to move things realistically, not just how they look.
MoCha is a new AI that swaps a person in a video with a new character using only one mask on one frame and a few reference photos.
Using Goal Force, video models can now be told the physical result you want (like “make this ball move left with a strong push”), instead of just vague text or a final picture.
RoboVIP is a plug-and-play tool that turns ordinary robot videos into many new, realistic, multi-view training videos without changing the original robot actions.
SpaceTimePilot is a video AI that lets you steer both where the camera goes (space) and how the action plays out (time), all from one input video.