How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (7)

#identity preservation

DreamActor-M2: Universal Character Image Animation via Spatiotemporal In-Context Learning

Intermediate
Mingshuang Luo, Shuang Liang et al. | Jan 29 | arXiv

DreamActor-M2 is a new way to make a still picture move by copying motion from a video while keeping the character’s look the same.

#character image animation, #spatiotemporal in-context learning, #video diffusion

CoDance: An Unbind-Rebind Paradigm for Robust Multi-Subject Animation

Intermediate
Shuai Tan, Biao Gong et al. | Jan 16 | arXiv

CoDance is a new way to animate many characters in one picture using just one pose video, even if the picture and the video do not line up perfectly.

#multi-subject animation, #pose-guided video generation, #Unbind–Rebind paradigm

Alterbute: Editing Intrinsic Attributes of Objects in Images

Intermediate
Tal Reiss, Daniel Winter et al. | Jan 15 | arXiv

Alterbute is a diffusion-based method that changes an object's intrinsic attributes (color, texture, material, shape) in a photo while keeping the object's identity and the scene intact.

#intrinsic attribute editing, #visual named entities, #identity preservation

MoCha: End-to-End Video Character Replacement without Structural Guidance

Intermediate
Zhengbo Xu, Jie Ma et al. | Jan 13 | arXiv

MoCha is a new method that replaces a person in a video with a new character using only a mask on a single frame and a few reference photos.

#video diffusion, #character replacement, #in-context learning

Knot Forcing: Taming Autoregressive Video Diffusion Models for Real-time Infinite Interactive Portrait Animation

Intermediate
Steven Xiao, Xindi Zhang et al. | Dec 25 | arXiv

This paper introduces Knot Forcing, a way to generate high-quality talking-head videos live, frame by frame.

#Knot Forcing, #autoregressive video diffusion, #temporal knot

The World is Your Canvas: Painting Promptable Events with Reference Images, Trajectories, and Text

Intermediate
Hanlin Wang, Hao Ouyang et al. | Dec 18 | arXiv

WorldCanvas lets you make videos where things happen exactly how you ask by combining three inputs: text (what happens), drawn paths called trajectories (when and where it happens), and reference images (who it is).

#WorldCanvas, #promptable world events, #trajectory-controlled video generation

KlingAvatar 2.0 Technical Report

Intermediate
Kling Team, Jialu Chen et al. | Dec 15 | arXiv

KlingAvatar 2.0 is a system that makes long, sharp, lifelike talking-person videos that follow audio, images, and text instructions all at once.

#audio-driven avatar, #video diffusion, #diffusion transformer