Trainable Log-linear Sparse Attention for Efficient Diffusion Transformers
Yifan Zhou, Zeqi Xiao et al. · Dec 18 · arXiv
This paper introduces Log-linear Sparse Attention (LLSA), a trainable sparse-attention mechanism for Diffusion Transformers that attends only to the most relevant tokens via a hierarchical top-K search, reducing attention cost to log-linear complexity.
#Log-linear Sparse Attention #Hierarchical Top-K #Hierarchical KV Enrichment
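To make the idea concrete, here is a minimal NumPy sketch of hierarchical top-K sparse attention for a single query: keys are grouped into blocks, each block is scored via a mean-pooled summary key, and dense attention runs only over the tokens of the top-scoring blocks. This is an illustrative assumption about the general technique, not the paper's actual LLSA implementation (which is trainable and multi-level); all function and variable names here are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def hierarchical_topk_attention(q, K, V, block=4, k_blocks=2):
    """Sparse attention for one query vector q.

    Coarse pass: score mean-pooled key blocks against q.
    Fine pass: dense attention over tokens of the top-k blocks only.
    (Hypothetical sketch, not the paper's trainable LLSA kernel.)
    """
    n, d = K.shape
    n_blocks = n // block
    Kb = K[: n_blocks * block].reshape(n_blocks, block, d)
    pooled = Kb.mean(axis=1)                    # one summary key per block
    block_scores = pooled @ q                   # coarse relevance per block
    top = np.argsort(block_scores)[-k_blocks:]  # indices of kept blocks
    idx = np.concatenate(
        [np.arange(b * block, (b + 1) * block) for b in top]
    )
    attn = softmax(K[idx] @ q / np.sqrt(d))     # fine attention, kept tokens only
    return attn @ V[idx]

rng = np.random.default_rng(0)
q = rng.standard_normal(8)
K = rng.standard_normal((16, 8))
V = rng.standard_normal((16, 8))
out = hierarchical_topk_attention(q, K, V)
print(out.shape)
```

With `block=4` and `k_blocks=2`, the fine pass touches only 8 of the 16 keys; applying the same block-then-refine selection recursively across levels is what yields the log-linear cost the title refers to.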