How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (20)

Filtered by: #video diffusion

PISCES: Annotation-free Text-to-Video Post-Training via Optimal Transport-Aligned Rewards

Intermediate
Minh-Quan Le, Gaurav Mittal et al. · Feb 2 · arXiv

This paper shows how to make text-to-video models create clearer, steadier, and more on-topic videos without using any human-labeled ratings.

#text-to-video · #optimal transport · #annotation-free

DreamActor-M2: Universal Character Image Animation via Spatiotemporal In-Context Learning

Intermediate
Mingshuang Luo, Shuang Liang et al. · Jan 29 · arXiv

DreamActor-M2 is a new way to make a still picture move by copying motion from a video while keeping the character’s look the same.

#character image animation · #spatiotemporal in-context learning · #video diffusion

Advancing Open-source World Models

Intermediate
Robbyant Team, Zelin Gao et al. · Jan 28 · arXiv

LingBot-World is an open-source world model that turns video generation into an interactive, real-time simulator.

#world model · #video diffusion · #causal attention

Self-Refining Video Sampling

Intermediate
Sangwon Jang, Taekyung Ki et al. · Jan 26 · arXiv

This paper shows how a video generator can improve its own videos during sampling, without extra training or external checkers.

#video generation · #flow matching · #denoising autoencoder

Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning

Intermediate
Moo Jin Kim, Yihuai Gao et al. · Jan 22 · arXiv

Cosmos Policy teaches robots to act by fine-tuning a powerful video model in just one training stage, without changing the model’s architecture.

#video diffusion · #robot policy learning · #visuomotor control

VideoMaMa: Mask-Guided Video Matting via Generative Prior

Intermediate
Sangbeom Lim, Seoung Wug Oh et al. · Jan 20 · arXiv

VideoMaMa is a model that turns simple black-and-white object masks into soft, precise cutouts (alpha mattes) for every frame of a video.

#video matting · #alpha matte · #binary segmentation mask

Transition Matching Distillation for Fast Video Generation

Intermediate
Weili Nie, Julius Berner et al. · Jan 14 · arXiv

Big video makers (diffusion models) create great videos but are too slow because they use hundreds of tiny clean-up steps; this paper distills them with transition matching so they can generate videos much faster.

#video diffusion · #distillation · #transition matching

Motion Attribution for Video Generation

Intermediate
Xindi Wu, Despoina Paschalidou et al. · Jan 13 · arXiv

Motive is a new way to figure out which training videos teach an AI how to move things realistically, not just how they look.

#motion attribution · #video diffusion · #optical flow

MoCha: End-to-End Video Character Replacement without Structural Guidance

Intermediate
Zhengbo Xu, Jie Ma et al. · Jan 13 · arXiv

MoCha is a new AI that swaps a person in a video with a new character using only one mask on one frame and a few reference photos.

#video diffusion · #character replacement · #in-context learning

RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation

Intermediate
Boyang Wang, Haoran Zhang et al. · Jan 8 · arXiv

RoboVIP is a plug-and-play tool that turns ordinary robot videos into many new, realistic, multi-view training videos without changing the original robot actions.

#robotic manipulation · #video diffusion · #multi-view generation

FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation

Intermediate
Jibin Song, Mingi Kwon et al. · Dec 31 · arXiv

FlowBlending is a simple way to speed up video diffusion models by smartly choosing when to use a big model and when a small one is enough.

#FlowBlending · #stage-aware sampling · #video diffusion

Yume-1.5: A Text-Controlled Interactive World Generation Model

Intermediate
Xiaofeng Mao, Zhen Li et al. · Dec 26 · arXiv

Yume-1.5 is a model that turns text or a single image into a living, explorable video world you can move through with keyboard keys.

#interactive world generation · #video diffusion · #temporal-spatial-channel modeling