Papers (2)

#block-wise generation

LLaDA-o: An Effective and Length-Adaptive Omni Diffusion Model

Intermediate

Zebin You, Xiaolu Zhang et al. · Mar 1 · arXiv

LLaDA-o is a single diffusion model that understands both images and text and can also generate images.

#LLaDA-o #Mixture of Diffusion #masked diffusion models

LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Beginner

Ethan Chern, Zhulin Hu et al. · Dec 29 · arXiv

LiveTalk turns slow, many-step video diffusion into a fast, 4-step, real-time system for talking avatars that listen, think, and respond with synchronized video.

#real-time video diffusion #on-policy distillation #multimodal conditioning
