SageBwd makes Transformer attention both fast and trainable by running most of its big matrix multiplications in 8-bit instead of full precision, in the backward pass as well as the forward one.
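To make the idea concrete, here is a minimal sketch of the core trick: quantize the inputs to int8 with per-row scales, multiply in integer arithmetic, then rescale back to float. The function names are mine and this is not SageBwd's actual kernel, just the shape of an 8-bit attention-score computation.

```python
import numpy as np

def quantize_int8(x):
    """Per-row symmetric quantization to int8, returning values and scales."""
    scale = np.abs(x).max(axis=-1, keepdims=True) / 127.0
    scale = np.maximum(scale, 1e-8)           # avoid division by zero
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_attention_scores(Q, K):
    """Attention logits Q @ K^T computed with int8 multiplies, then dequantized."""
    q8, sq = quantize_int8(Q)
    k8, sk = quantize_int8(K)
    # Accumulate in int32, then rescale back to float with the saved scales.
    logits_i32 = q8.astype(np.int32) @ k8.astype(np.int32).T
    return logits_i32.astype(np.float32) * sq * sk.T

# Toy check against the full-precision result.
rng = np.random.default_rng(0)
Q, K = rng.standard_normal((4, 64)), rng.standard_normal((8, 64))
print(np.max(np.abs(int8_attention_scores(Q, K) - Q @ K.T)))  # small error
```

The per-row scales are what keep the quantization error small enough for training to work; dropping them and using one global scale would lose much more precision.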
This paper dissects Mamba-2 (a state-space model that runs in linear time, acting as a fast alternative to attention) and strips it down to the few components that actually drive its accuracy.
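The "linear time" part is easiest to see in code. Below is a toy version of the recurrence behind models like Mamba-2: instead of comparing every token with every other token, a small state matrix is updated once per step, so cost grows linearly with sequence length. This is a simplified sketch (fixed scalar decay, my own function name), not the paper's architecture.

```python
import numpy as np

def linear_time_attention(Q, K, V, decay=0.99):
    """Causal attention as a running state update: O(T) in sequence length
    instead of O(T^2). The state S accumulates outer products of keys and
    values with a decay, loosely mirroring a state-space recurrence."""
    T, d = Q.shape
    S = np.zeros((d, V.shape[1]))
    out = np.empty_like(V)
    for t in range(T):
        S = decay * S + np.outer(K[t], V[t])  # fold in the current token
        out[t] = Q[t] @ S                     # read out with the current query
    return out

rng = np.random.default_rng(1)
Q, K, V = (rng.standard_normal((16, 8)) for _ in range(3))
print(linear_time_attention(Q, K, V).shape)  # (16, 8)
```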
This paper teaches a language model to write fast GPU kernels (small programs that make computations run fast on the GPU) in Triton, using reinforcement learning whose reward targets meaningful speedups, not just correctness.
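The key design point is the reward: a wrong kernel earns nothing, and a correct one is scored by how much faster it actually runs. Here is a toy reward function in that spirit; the paper's real reward shaping and benchmarking will differ, and all names here (`kernel_reward`, `generated_fn`) are mine.

```python
import time
import numpy as np

def kernel_reward(candidate, reference, test_inputs, baseline_time):
    """Toy speed-aware RL reward: an incorrect kernel scores zero; a correct
    one is scored by its measured speedup over a baseline."""
    for x in test_inputs:                       # correctness gate comes first
        if not np.allclose(candidate(x), reference(x), atol=1e-4):
            return 0.0
    start = time.perf_counter()
    for x in test_inputs:
        candidate(x)
    elapsed = time.perf_counter() - start
    speedup = baseline_time / max(elapsed, 1e-9)
    return max(speedup - 1.0, 0.0)              # reward only real speedups

# Hypothetical usage with a model-generated kernel:
# reward = kernel_reward(generated_fn, np.sort, batches, t_baseline)
```

Gating on correctness before timing prevents the model from being rewarded for fast-but-wrong code, which is the failure mode a correctness-only reward invites from the other direction.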
This paper introduces Log-linear Sparse Attention (LLSA), a new way for Diffusion Transformers to attend only to the most useful tokens via a hierarchical search, cutting the cost of attention from quadratic to log-linear.
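A two-level toy version shows the hierarchical idea: score cheap block summaries first, keep only the most promising blocks, and run real attention just over their tokens. A full log-linear scheme would recurse over more levels; this sketch and its names are mine, not LLSA's actual algorithm.

```python
import numpy as np

def hierarchical_sparse_attention(q, K, V, block=16, top_blocks=2):
    """Two-level sketch of hierarchical top-k attention for one query."""
    T, d = K.shape
    n_blocks = T // block
    Kb = K[: n_blocks * block].reshape(n_blocks, block, d)
    summaries = Kb.mean(axis=1)                 # one coarse key per block
    coarse = summaries @ q                      # score blocks cheaply
    keep = np.argsort(coarse)[-top_blocks:]    # keep the best blocks only
    idx = np.concatenate(
        [np.arange(b * block, (b + 1) * block) for b in keep])
    logits = K[idx] @ q / np.sqrt(d)            # exact attention, kept tokens
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ V[idx]

rng = np.random.default_rng(2)
K, V = rng.standard_normal((64, 8)), rng.standard_normal((64, 8))
print(hierarchical_sparse_attention(rng.standard_normal(8), K, V).shape)  # (8,)
```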