How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (22)

#video diffusion

PISCES: Annotation-free Text-to-Video Post-Training via Optimal Transport-Aligned Rewards

Intermediate
Minh-Quan Le, Gaurav Mittal et al. · Feb 2 · arXiv

This paper shows how to make text-to-video models create clearer, steadier, and more on-topic videos without using any human-labeled ratings.

#text-to-video #optimal transport #annotation-free

DreamActor-M2: Universal Character Image Animation via Spatiotemporal In-Context Learning

Intermediate
Mingshuang Luo, Shuang Liang et al. · Jan 29 · arXiv

DreamActor-M2 is a new way to make a still picture move by copying motion from a video while keeping the character’s look the same.

#character image animation #spatiotemporal in-context learning #video diffusion

Advancing Open-source World Models

Intermediate
Robbyant Team, Zelin Gao et al. · Jan 28 · arXiv

LingBot-World is an open-source world model that turns video generation into an interactive, real-time simulator.

#world model #video diffusion #causal attention

Self-Refining Video Sampling

Intermediate
Sangwon Jang, Taekyung Ki et al. · Jan 26 · arXiv

This paper shows how a video generator can improve its own videos during sampling, without extra training or outside checkers.

#video generation #flow matching #denoising autoencoder

Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning

Intermediate
Moo Jin Kim, Yihuai Gao et al. · Jan 22 · arXiv

Cosmos Policy teaches robots to act by fine-tuning a powerful video model in just one training stage, without changing the model’s architecture.

#video diffusion #robot policy learning #visuomotor control

VideoMaMa: Mask-Guided Video Matting via Generative Prior

Intermediate
Sangbeom Lim, Seoung Wug Oh et al. · Jan 20 · arXiv

VideoMaMa is a model that turns simple black-and-white object masks into soft, precise cutouts (alpha mattes) for every frame of a video.

#video matting #alpha matte #binary segmentation mask

Transition Matching Distillation for Fast Video Generation

Intermediate
Weili Nie, Julius Berner et al. · Jan 14 · arXiv

Big video makers (diffusion models) create great videos but are too slow because they use hundreds of tiny clean-up steps.

#video diffusion #distillation #transition matching

Motion Attribution for Video Generation

Intermediate
Xindi Wu, Despoina Paschalidou et al. · Jan 13 · arXiv

Motive is a new way to figure out which training videos teach an AI how to move things realistically, not just how they look.

#motion attribution #video diffusion #optical flow

MoCha: End-to-End Video Character Replacement without Structural Guidance

Intermediate
Zhengbo Xu, Jie Ma et al. · Jan 13 · arXiv

MoCha is a new AI that swaps a person in a video with a new character using only one mask on one frame and a few reference photos.

#video diffusion #character replacement #in-context learning

Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals

Beginner
Nate Gillman, Yinghua Zhou et al. · Jan 9 · arXiv

Goal Force lets you tell a video model what physical result you want (like “make this ball move left with a strong push”), instead of relying only on vague text or a final picture.

#goal force #force vector control #visual planning

RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation

Intermediate
Boyang Wang, Haoran Zhang et al. · Jan 8 · arXiv

RoboVIP is a plug-and-play tool that turns ordinary robot videos into many new, realistic, multi-view training videos without changing the original robot actions.

#robotic manipulation #video diffusion #multi-view generation

SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time

Beginner
Zhening Huang, Hyeonho Jeong et al. · Dec 31 · arXiv

SpaceTimePilot is a video AI that lets you steer both where the camera goes (space) and how the action plays (time) from one input video.

#video diffusion #space–time disentanglement #camera control