How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (20)

#video diffusion

Over++: Generative Video Compositing for Layer Interaction Effects

Intermediate
Luchao Qi, Jiaye Wu et al. · Dec 22 · arXiv

Over++ is a video AI that adds realistic effects like shadows, splashes, dust, and smoke between a foreground and a background without changing the original footage.

#augmented compositing · #video diffusion · #video inpainting
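The title nods to the classic "over" compositing operator. As a rough mental model only (not Over++'s actual pipeline), the sketch below composites a hand-made effect layer between an untouched background and foreground plate; in the paper's setting that middle layer (a shadow, splash, or dust pass) would come from a generative model.

```python
import numpy as np

def over(fg_rgb, fg_alpha, bg_rgb):
    """Classic 'over' operator: blend a foreground layer onto a background."""
    a = fg_alpha[..., None]                 # broadcast the matte over RGB
    return a * fg_rgb + (1.0 - a) * bg_rgb

H, W = 4, 4
background = np.full((H, W, 3), 0.2)        # original background plate
foreground = np.full((H, W, 3), 0.9)        # original foreground layer
fg_alpha = np.zeros((H, W)); fg_alpha[1:3, 1:3] = 1.0    # subject matte

# Hypothetical effect layer (think: a soft shadow) that a generative model
# would emit *between* the two layers; here it is just a dark patch.
effect_rgb = np.zeros((H, W, 3))
effect_alpha = np.zeros((H, W)); effect_alpha[2:4, 1:3] = 0.5

# Composite back to front: background <- effect <- foreground.
# Neither original layer is edited; only the inserted effect changes the frame.
frame = over(effect_rgb, effect_alpha, background)
frame = over(foreground, fg_alpha, frame)
print(frame.shape)                          # (4, 4, 3)
```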

StoryMem: Multi-shot Long Video Storytelling with Memory

Intermediate
Kaiwen Zhang, Liming Jiang et al. · Dec 22 · arXiv

StoryMem is a new way to make minute‑long, multi‑shot videos that keep the same characters, places, and style across many clips.

#StoryMem · #Memory-to-Video · #multi-shot video generation
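A toy sketch of the memory idea, under the assumption that earlier shots are summarised as keyframe embeddings and handed back to the generator as extra conditioning; `embed_keyframe` and `generate_shot` are placeholders, not StoryMem's actual Memory-to-Video architecture.

```python
import numpy as np

def embed_keyframe(frame: np.ndarray) -> np.ndarray:
    """Placeholder embedding; a real system would use a learned visual encoder."""
    return frame.mean(axis=(0, 1))                 # crude per-channel summary

def generate_shot(prompt: str, memory: list) -> np.ndarray:
    """Placeholder generator returning a dummy clip of shape (T, H, W, C).
    A real memory-to-video model would attend over `memory` so characters,
    places, and style stay consistent with earlier shots."""
    rng = np.random.default_rng(len(memory))
    return rng.random((16, 32, 32, 3))

prompts = [
    "A knight rides into the village.",
    "The same knight enters the tavern.",
    "Close-up of the knight by the fire.",
]

memory = []                                        # keyframe embeddings so far
story = []
for prompt in prompts:
    clip = generate_shot(prompt, memory)
    story.append(clip)
    memory.append(embed_keyframe(clip[-1]))        # remember each shot's last frame

print(len(story), "shots;", len(memory), "memory entries of shape", memory[0].shape)
```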

Animate Any Character in Any World

Intermediate
Yitong Wang, Fangyun Wei et al. · Dec 18 · arXiv

AniX is a system that lets you place any character into any 3D world and control them with plain language, like “run forward” or “play a guitar.”

#AniX · #3D Gaussian Splatting · #world models
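The tags point at 3D Gaussian Splatting for rendering. The toy below shows only the blending rule splatting relies on, front-to-back alpha compositing of depth-sorted Gaussians at a single pixel; the Gaussians and opacities are invented, and none of AniX's language control or world modelling is represented.

```python
import numpy as np

# One pixel's worth of Gaussian splatting: each depth-sorted Gaussian contributes
# colour weighted by its opacity and its falloff at this pixel, blended front to
# back. Real 3DGS projects anisotropic 3D Gaussians and rasterises whole images.
gaussians = [  # (depth, opacity, falloff_at_pixel, RGB colour) -- made-up values
    (2.0, 0.8, 0.9, np.array([1.0, 0.0, 0.0])),   # character's red cloak
    (5.0, 0.6, 0.7, np.array([0.1, 0.6, 0.1])),   # greenery behind the character
    (9.0, 1.0, 1.0, np.array([0.2, 0.2, 0.8])),   # sky, effectively opaque
]

color = np.zeros(3)
transmittance = 1.0                                # light still reaching the eye
for depth, opacity, falloff, rgb in sorted(gaussians, key=lambda g: g[0]):
    alpha = opacity * falloff
    color += transmittance * alpha * rgb
    transmittance *= 1.0 - alpha

print(color.round(3), f"remaining transmittance = {transmittance:.3f}")
```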

FlashPortrait: 6x Faster Infinite Portrait Animation with Adaptive Latent Prediction

Intermediate
Shuyuan Tu, Yueming Pan et al. · Dec 18 · arXiv

FlashPortrait makes talking-portrait videos that keep a person’s identity steady for as long as you want—minutes or even hours.

#FlashPortrait · #portrait animation · #identity consistency
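The card names "adaptive latent prediction" without explaining it, so the sketch below is only one generic reading of that phrase: extrapolate the next frame's latent cheaply and fall back to the full model when the extrapolation looks unreliable. Every function, shape, and threshold here is hypothetical, not FlashPortrait's method.

```python
import numpy as np

def full_model_step(latent: np.ndarray) -> np.ndarray:
    """Stand-in for the expensive denoising pass that yields the next latent."""
    return latent * 0.95

def cheap_latent_prediction(prev: np.ndarray, prev2: np.ndarray) -> np.ndarray:
    """Linear extrapolation of the next latent from the two previous ones."""
    return prev + (prev - prev2)

rng = np.random.default_rng(0)
latents = [rng.random(8), rng.random(8)]       # first two frames: full model
full_calls = 2
for _ in range(98):                            # roll out 100 frames in total
    guess = cheap_latent_prediction(latents[-1], latents[-2])
    # "Adaptive" part (invented criterion): trust the cheap prediction while the
    # frame-to-frame change is small, otherwise pay for the full model.
    if np.linalg.norm(guess - latents[-1]) < 0.5:
        latents.append(guess)
    else:
        latents.append(full_model_step(guess))
        full_calls += 1

print(f"{len(latents)} frames from {full_calls} full-model calls")
```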

LongVie 2: Multimodal Controllable Ultra-Long Video World Model

Intermediate
Jianxiong Gao, Zhaoxi Chen et al. · Dec 15 · arXiv

LongVie 2 is a video world model that can generate controllable videos for 3–5 minutes while keeping the look and motion steady over time.

#long video generation · #world model · #multimodal control
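The summary doesn't say how the 3–5 minute consistency is achieved; a common recipe for long clips, sketched below purely as an assumption, is chunked generation where each chunk is conditioned on the tail frames of the previous one plus the control signals. `generate_chunk` and the control dict are placeholders.

```python
from typing import Optional
import numpy as np

CHUNK, OVERLAP = 32, 8   # frames per chunk, and tail frames carried forward

def generate_chunk(context: Optional[np.ndarray], controls: dict) -> np.ndarray:
    """Placeholder for a controllable video model. A real world model would
    condition on the previous chunk's tail frames plus the control signals so
    appearance and motion stay steady across chunks."""
    rng = np.random.default_rng(0)
    return rng.random((CHUNK, 16, 16, 3))

controls = {"text": "a drone flies along a coastline", "trajectory": "...", "depth": "..."}
video = []
context: Optional[np.ndarray] = None
for _ in range(10):                            # 10 chunks ~ a few minutes of footage
    chunk = generate_chunk(context, controls)
    # Assume the model re-generates the overlap at the start of each chunk,
    # so those frames are dropped to avoid duplication when stitching.
    video.append(chunk if context is None else chunk[OVERLAP:])
    context = chunk[-OVERLAP:]                 # carry the tail forward as conditioning

frames = np.concatenate(video, axis=0)
print(frames.shape)                            # (248, 16, 16, 3)
```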

KlingAvatar 2.0 Technical Report

Intermediate
Kling Team, Jialu Chen et al. · Dec 15 · arXiv

KlingAvatar 2.0 is a system that makes long, sharp, lifelike talking-person videos that follow audio, images, and text instructions all at once.

#audio-driven avatar · #video diffusion · #diffusion transformer
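One simple way a diffusion transformer can follow audio, images, and text at once is to project every conditioning stream to a shared width and concatenate the tokens into a single sequence for attention. The sketch below shows only that concatenation idea; the shapes and random projections are made up and are not KlingAvatar's architecture.

```python
import numpy as np

D = 64   # shared model width; every conditioning stream is projected to it

def project(tokens: np.ndarray, in_dim: int) -> np.ndarray:
    """Placeholder linear projection into the shared width D."""
    rng = np.random.default_rng(in_dim)             # fixed fake weights per modality
    return tokens @ rng.standard_normal((in_dim, D)) * 0.1

# Stand-in conditioning streams for one clip.
audio_feats = np.random.rand(120, 32)   # e.g. 120 audio frames x 32 features
image_feats = np.random.rand(16, 48)    # e.g. 16 reference-image patch tokens
text_feats = np.random.rand(20, 24)     # e.g. 20 prompt tokens

# Project everything to the same width and concatenate along the sequence axis,
# so attention inside the diffusion transformer can mix all three signals.
conditioning = np.concatenate([
    project(audio_feats, 32),
    project(image_feats, 48),
    project(text_feats, 24),
], axis=0)
print(conditioning.shape)               # (156, 64) tokens fed alongside video latents
```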

Structure From Tracking: Distilling Structure-Preserving Motion for Video Generation

Intermediate
Yang Fei, George Stoica et al. · Dec 12 · arXiv

The paper teaches a video generator to move things realistically by borrowing motion knowledge from a strong video tracker.

#video diffusion · #structure-preserving motion · #SAM2
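A minimal sketch of the distillation idea as the summary describes it: compare the motion implied by the generator with point tracks from a frozen tracker (the tags mention SAM2) and penalise the disagreement. Both "models" below are dummy functions, and the displacement-matching loss is an illustrative choice, not necessarily the paper's.

```python
import numpy as np

def teacher_tracks(video: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Placeholder for a strong frozen tracker: point positions over time, (T, N, 2)."""
    drift = np.linspace(0, 1, video.shape[0])[:, None, None]
    return points[None] + drift * np.array([3.0, 0.0])      # points slide right

def student_motion(video: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Placeholder for the motion implied by the video generator's features."""
    drift = np.linspace(0, 1, video.shape[0])[:, None, None]
    return points[None] + drift * np.array([2.0, 0.5])      # slightly wrong motion

video = np.zeros((8, 32, 32, 3))                  # dummy clip, T x H x W x C
points = np.array([[10.0, 10.0], [20.0, 5.0]])    # N query points in pixels

# Distillation target: make the generator's motion agree with the tracker's.
# Comparing frame-to-frame displacements (not absolute positions) focuses the
# loss on *motion*, which is what the tracker is good at.
t_disp = np.diff(teacher_tracks(video, points), axis=0)
s_disp = np.diff(student_motion(video, points), axis=0)
motion_distill_loss = np.mean((t_disp - s_disp) ** 2)
print(float(motion_distill_loss))
```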

SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations

Intermediate
Wenhao Yan, Sheng Ye et al. · Dec 5 · arXiv

SCAIL is a new AI system that turns a single character image into a studio-quality animation by following the moves in a driving video.

#character animation · #3D pose representation · #occlusion-aware pose
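The card hinges on a 3D-consistent pose representation; the sketch below only illustrates why 3D poses are attractive as conditioning: the same toy skeleton can be re-projected from a rotated camera and stays geometrically coherent, unlike raw 2D keypoints copied from the driving video. The skeleton, camera, and focal length are made up, and SCAIL's in-context conditioning is not shown.

```python
import numpy as np

def yaw(theta: float) -> np.ndarray:
    """Rotation about the vertical (y) axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def project(joints_3d: np.ndarray, focal: float = 500.0) -> np.ndarray:
    """Pinhole projection of camera-space 3D joints (N, 3) to 2D pixels (N, 2)."""
    x, y, z = joints_3d.T
    return np.stack([focal * x / z, focal * y / z], axis=1)

# Toy 3D skeleton for one driving-video frame (head, shoulders, hips), in metres.
joints = np.array([[0.00, 1.7, 3.0],
                   [-0.20, 1.5, 3.0], [0.20, 1.5, 3.0],
                   [-0.15, 1.0, 3.0], [0.15, 1.0, 3.0]])

# Re-render the same pose from two camera yaws: the 2D conditioning changes,
# but it always comes from one geometrically consistent 3D pose.
for angle in (0.0, np.pi / 6):
    pose_2d = project(joints @ yaw(angle).T)
    print(f"camera yaw {angle:.2f} rad -> 2D keypoints:\n{pose_2d.round(1)}")
```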