This paper teaches talking avatars not just to speak, but to look around their scene and handle nearby objects exactly as a text instruction says.
This paper shows a simple, one-model way to dub videos that keeps the new voice and the lip movements naturally in sync.
This paper builds a real-time talking-listening head avatar that reacts naturally to your words, tone, nods, and smiles in about half a second.
KlingAvatar 2.0 is a system that makes long, sharp, lifelike talking-person videos that follow audio, images, and text instructions all at once.