How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (2)

#audio-driven animation

DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation

Intermediate
Xu Guo, Fulong Ye et al. | Feb 12 | arXiv

DreamID-Omni is a single model that can create, edit, and animate human-centric videos with matching, synchronized voices.

#audio-video generation #diffusion transformer #identity preservation

Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars

Beginner
Youliang Zhang, Zhengguang Zhou et al. | Feb 2 | arXiv

This paper teaches talking avatars not just to speak, but to look around their scene and handle nearby objects exactly as a text instruction says.

#grounded human-object interaction #talking avatars #diffusion transformer