This paper fixes a major weakness of long video generation: models either forget what happened earlier or slowly drift off-topic over time.
RISE-Video is a new test that checks whether video-making AIs follow hidden world rules rather than just making pretty pictures.
VideoMaMa is a model that turns simple black-and-white object masks into soft, precise cutouts (alpha mattes) for every frame of a video.
Motion 3-to-4 turns a single regular video into a 3D object that moves over time (a 4D asset) by first recovering the object’s shape and then figuring out how every part of it moves.
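To make the shape-then-motion idea concrete, here is a minimal sketch of a generic two-stage video-to-4D pipeline. Every name in it (reconstruct_static_shape, estimate_per_point_motion, the point-cloud representation, the dummy outputs) is an illustrative assumption, not Motion 3-to-4's actual interface.

```python
# Minimal sketch of a generic shape-then-motion "video to 4D" pipeline.
# All function names and data formats here are hypothetical placeholders,
# NOT the Motion 3-to-4 API.
import numpy as np

def reconstruct_static_shape(frame: np.ndarray, n_points: int = 2048) -> np.ndarray:
    """Stage 1 (placeholder): recover a static 3D point set for the object."""
    rng = np.random.default_rng(0)
    return rng.normal(size=(n_points, 3))  # dummy canonical point cloud

def estimate_per_point_motion(points: np.ndarray, video: np.ndarray) -> np.ndarray:
    """Stage 2 (placeholder): predict a 3D offset for every point in every frame."""
    T = video.shape[0]
    t = np.linspace(0.0, 1.0, T)[:, None, None]
    return 0.1 * np.sin(2 * np.pi * t) * np.ones_like(points)[None]  # dummy motion

video = np.zeros((16, 256, 256, 3), dtype=np.float32)   # T x H x W x C input clip
canonical = reconstruct_static_shape(video[0])           # shape from one frame
offsets = estimate_per_point_motion(canonical, video)    # per-frame deformation
asset_4d = canonical[None] + offsets                     # (T, N, 3) animated geometry
print(asset_4d.shape)
```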
CoDance is a new way to animate many characters in one picture using just one pose video, even if the picture and the video do not line up perfectly.
FlowAct-R1 is a new system that makes lifelike human videos in real time, so the on-screen person can react quickly as you talk to them.
Motive is a new way to figure out which training videos teach an AI to move things realistically, not just to make them look right.
MoCha is a new AI that swaps a person in a video with a new character using only one mask on one frame and a few reference photos.
DrivingGen is a new, all-in-one test that fairly checks how well AI can imagine future driving videos and motions.
DreamID-V is a new AI method that swaps faces in videos while keeping the body movements, expressions, lighting, and background steady and natural.
MorphAny3D is a training-free way to smoothly change one 3D object into another, even if they are totally different (like a bee into a biplane).
This paper makes diffusion-based video super-resolution (VSR) practical for live, low-latency use by removing the need for future frames and cutting denoising from ~50 steps down to just 4.
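As a rough illustration of what "causal, 4-step" sampling could look like, here is a minimal sketch of a streaming loop that restores each incoming low-resolution frame with only four denoising steps and only already-seen frames as context. The denoiser, the noise schedule, and the 4x upscaling factor are all placeholder assumptions, not the paper's actual model or sampler.

```python
# Minimal sketch of causal, few-step diffusion sampling for streaming VSR.
# The denoiser is a dummy stand-in for a trained network; the 4-step schedule
# and past-frames-only conditioning mirror the idea described above, but every
# detail here is an illustrative assumption.
import numpy as np

def denoiser(noisy_hr, lr_frame, past_hr, sigma):
    """Placeholder network: would predict the clean HR frame from the noisy
    sample, the current low-res frame, and previously restored frames.
    (past_hr and sigma are unused in this dummy version.)"""
    upsampled = np.repeat(np.repeat(lr_frame, 4, axis=0), 4, axis=1)
    return 0.5 * noisy_hr + 0.5 * upsampled

def restore_frame(lr_frame, past_hr, steps=4, rng=None):
    """Run a short (4-step) denoising schedule for one incoming frame."""
    rng = rng or np.random.default_rng(0)
    sigmas = np.linspace(1.0, 0.0, steps + 1)            # coarse noise schedule
    x = rng.normal(size=(lr_frame.shape[0] * 4, lr_frame.shape[1] * 4, 3))
    for i in range(steps):
        x0_hat = denoiser(x, lr_frame, past_hr, sigmas[i])
        # blend toward the predicted clean frame as the noise level shrinks
        x = x0_hat + sigmas[i + 1] * (x - x0_hat) / max(sigmas[i], 1e-8)
    return x

# Streaming loop: each low-res frame is restored using only already-seen frames.
past_hr = []
for t in range(8):
    lr_frame = np.zeros((64, 64, 3), dtype=np.float32)   # stand-in LR input
    hr_frame = restore_frame(lr_frame, past_hr)
    past_hr.append(hr_frame)
print(len(past_hr), past_hr[0].shape)
```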