🎓How I Study AIHISA

📖Read

📄Papers 📰Blogs 🎬Courses

💡Learn

🛤️Paths 📚Topics 💡Concepts 🎴Shorts

🎯Practice

📝Daily Log 🎯Prompts 🧠Review

Search Settings

How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers4

All Beginner Intermediate Advanced

All Sources arXiv

#lip synchronization

MOVA: Towards Scalable and Synchronized Video-Audio Generation

SII-OpenMOSS Team, Donghua Yu et al.Feb 9arXiv

MOVA is an open-source AI that makes videos and sounds at the same time so mouths, actions, and noises match perfectly.

#video-audio generation#lip synchronization#dual-tower architecture

Not triaged yet

JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion

Anthony Chen, Naomi Ken Korem et al.Jan 29arXiv

This paper shows a simple, one-model way to dub videos that makes the new voice and the lips move together naturally.

#video dubbing#audio-visual diffusion#joint generation

Not triaged yet

Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

Taekyung Ki, Sangwon Jang et al.Jan 2arXiv

This paper builds a real-time talking-listening head avatar that reacts naturally to your words, tone, nods, and smiles in about half a second.

#interactive avatar#talking head generation#causal diffusion forcing

Not triaged yet

KlingAvatar 2.0 Technical Report

Kling Team, Jialu Chen et al.Dec 15arXiv

KlingAvatar 2.0 is a system that makes long, sharp, lifelike talking-person videos that follow audio, images, and text instructions all at once.

#audio-driven avatar#video diffusion#diffusion transformer

Not triaged yet