Papers (5)

#diffusion language models

FourierSampler: Unlocking Non-Autoregressive Potential in Diffusion Language Models via Frequency-Guided Generation

Diffusion language models (dLLMs) can write text in any order, but common decoding methods still prefer left-to-right, which wastes their superpower.

#diffusion language models #non-autoregressive generation #frequency-domain analysis

Residual Context Diffusion Language Models

Intermediate

Yuezhou Hu, Harman Singh et al. | Jan 30 | arXiv

Diffusion language models (dLLMs) generate several tokens at once but usually throw away lots of helpful clues at each step; RCD keeps and reuses those clues.

#diffusion language models #residual context diffusion #soft tokens

On the Role of Discreteness in Diffusion LLMs

Intermediate

Ziqi Jin, Bin Wang et al. | Dec 27 | arXiv

The paper asks what a truly good diffusion-based language model should look like and lists five must-have properties.

#diffusion language models #smooth corruption #discrete tokens

Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed

Intermediate

Yonggan Fu, Lexington Whalen et al. | Dec 16 | arXiv

Autoregressive (AR) models write one word at a time, which is accurate but slow, especially when your computer or GPU can’t keep many tasks in memory at once.

#diffusion language models #autoregressive models #AR-to-dLM conversion

Learning Unmasking Policies for Diffusion Language Models

Intermediate

Metod Jazbec, Theo X. Olausson et al. | Dec 9 | arXiv

Diffusion language models write by gradually unmasking hidden words, so deciding which blanks to reveal next is a big deal for both speed and accuracy.

#diffusion language models #masked diffusion #unmasking policy