InfinityStory is a new system that can make very long videos (even hours long) where the world stays consistent and characters transition smoothly between shots.
NOVA is a new video editor that lets you change a few key frames (sparse control) while it preserves the original motion and background details everywhere else (dense synthesis).
The paper introduces CMDM, a new way to make computer-generated human motions that feel smooth over time and match the meaning of a text prompt.
This paper introduces GEBench, a new test to check whether image generation models can mimic real app screens that change when you click or type.
This paper shows how to make text-to-video models create clearer, steadier, and more on-topic videos without using any human-labeled ratings.
This paper shows a simple, one-model way to dub videos that keeps the new voice and the lip movements naturally in sync.
This paper shows how a video generator can improve its own videos during sampling, without extra training or outside checkers.
RoboVIP is a plug-and-play tool that turns ordinary robot videos into many new, realistic, multi-view training videos without changing the original robot actions.
FlowBlending is a simple way to speed up video diffusion models by smartly choosing when to use a big model and when a small one is enough.
This paper introduces Knot Forcing, a way to make talking-head videos that look great while being generated live, frame by frame.
DreaMontage is a new AI method that makes long, single-shot videos that feel smooth and connected, even when you only provide scattered images or short clips partway through.
This paper builds a new test called Video Reality Test to see if AI-made ASMR videos can fool both people and vision-language models (VLMs).