This paper shows how to make text-to-video models create clearer, steadier, and more on-topic videos without using any human-labeled ratings.
This paper shows a simple, single-model way to dub videos that keeps the new voice and the speaker's lip movements naturally in sync.
This paper shows how a video generator can improve its own videos during sampling, without extra training or external checkers.
RoboVIP is a plug-and-play tool that turns ordinary robot videos into many new, realistic, multi-view training videos without changing the original robot actions.
FlowBlending is a simple way to speed up video diffusion models by smartly choosing when to use a big model and when a small one is enough.
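The big-model/small-model idea can be illustrated with a toy routing rule: spend the expensive model on the earliest, noisiest sampling steps (where the video's overall structure is decided) and hand the rest to a cheap model. This is a minimal sketch under that assumption; `big_denoise`, `small_denoise`, and the step threshold are illustrative stand-ins, not FlowBlending's actual components or schedule.

```python
import numpy as np

def big_denoise(x, t):
    # Stand-in for an expensive, high-capacity denoiser.
    return x * 0.9

def small_denoise(x, t):
    # Stand-in for a cheap, low-capacity denoiser.
    return x * 0.92

def sample(x, num_steps=10, big_steps=3):
    """Run the big model on the earliest (noisiest) steps, where global
    structure is decided, then switch to the small model for the rest."""
    used = []
    for t in range(num_steps, 0, -1):  # t counts down from most to least noisy
        if t > num_steps - big_steps:
            x = big_denoise(x, t)
            used.append("big")
        else:
            x = small_denoise(x, t)
            used.append("small")
    return x, used

x0, schedule = sample(np.ones((4, 4)))
# With 10 steps and big_steps=3, the big model runs 3 times, the small one 7.
```

Any real system would also need a principled rule for *when* the small model is "enough" (e.g. based on step index or estimated error), which is exactly the choice such a method has to make well.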
This paper introduces Knot Forcing, a way to make talking-head videos that look great while being generated live, frame by frame.
DreaMontage is a new AI method that makes long, single-shot videos that feel smooth and connected, even when it is only given scattered images or short clips to work from along the way.
This paper builds a new benchmark, Video Reality Test, to see whether AI-made ASMR videos can fool both people and AI video judges (vision-language models, VLMs).
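The core measurement behind such a test can be sketched as a "fool rate": the fraction of AI-generated clips that a judge (human or VLM) mistakes for real footage. The record format and data below are purely illustrative, not the benchmark's actual schema or results.

```python
# Each record: did a human rater and a VLM judge this AI-made clip as real?
judgments = [
    {"video": "asmr_001", "human_says_real": True,  "vlm_says_real": True},
    {"video": "asmr_002", "human_says_real": False, "vlm_says_real": True},
    {"video": "asmr_003", "human_says_real": True,  "vlm_says_real": False},
    {"video": "asmr_004", "human_says_real": True,  "vlm_says_real": True},
]

def fool_rate(records, key):
    """Fraction of AI-made videos the judge mistook for real footage."""
    return sum(r[key] for r in records) / len(records)

human_rate = fool_rate(judgments, "human_says_real")  # 3 of 4 fooled -> 0.75
vlm_rate = fool_rate(judgments, "vlm_says_real")      # 3 of 4 fooled -> 0.75
```

Comparing the two rates is what lets a benchmark like this say whether generated videos fool people, machines, both, or neither.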