How I Study AI - Learn AI Papers & Lectures the Easy Way

Efficient Autoregressive Video Diffusion with Dummy Head

Intermediate

Hang Guo, Zhaoyang Jia et al. | Jan 28 | arXiv

This paper finds that about one in four attention heads in autoregressive video diffusion models attends almost entirely to the current frame and nearly ignores past frames, which wastes memory and time.

#autoregressive video diffusion  #multi-head self-attention  #KV cache compression
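
To make the summary's finding concrete, here is a minimal sketch of how one might measure how much of a head's attention lands on the current frame. It is an illustration only, not the paper's method: the function names, the toy attention weights, and the 0.9 threshold are all assumptions made for this example.

```python
from typing import List

Matrix = List[List[float]]  # rows = current-frame queries, columns = keys (past frames first)


def current_frame_mass(attn: Matrix, frame_len: int) -> float:
    """Average share of attention that queries place on the last `frame_len` keys.

    Assumes the current frame's keys are the final `frame_len` columns.
    """
    shares = [sum(row[-frame_len:]) / sum(row) for row in attn]
    return sum(shares) / len(shares)


def find_current_frame_heads(heads: List[Matrix], frame_len: int,
                             threshold: float = 0.9) -> List[int]:
    """Return indices of heads whose attention mass on the current frame
    meets a (hypothetical) threshold."""
    return [i for i, attn in enumerate(heads)
            if current_frame_mass(attn, frame_len) >= threshold]


# Toy data: 4 heads, 2 current-frame queries, 4 keys (2 past + 2 current).
heads = [
    [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]],  # spread evenly
    [[0.01, 0.01, 0.49, 0.49], [0.02, 0.02, 0.48, 0.48]],  # almost all on current frame
    [[0.40, 0.40, 0.10, 0.10], [0.30, 0.30, 0.20, 0.20]],  # mostly on past frames
    [[0.10, 0.20, 0.30, 0.40], [0.20, 0.20, 0.30, 0.30]],  # mixed
]
flagged = find_current_frame_heads(heads, frame_len=2)
print(flagged)
```

In this toy setup only the second head is flagged. For a head like that, cached keys and values from past frames contribute very little to its output, which is the kind of waste the summary describes.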
