Papers (6)

#block diffusion

dLLM: Simple Diffusion Language Modeling

Intermediate

Zhanhui Zhou, Lingjie Chen et al. · Feb 26 · arXiv

dLLM is a single, open-source toolbox that standardizes how diffusion language models are trained, run, and tested.

#diffusion language models #masked diffusion #block diffusion

DFlash: Block Diffusion for Flash Speculative Decoding

Intermediate

Jian Chen, Yesheng Liang et al. · Feb 5 · arXiv

DFlash applies block diffusion to speculative decoding so that large language models answer much faster without changing their final outputs.

#DFlash #speculative decoding #diffusion language model

Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model

Intermediate

Chenghao Fan, Wen Heng et al. · Jan 22 · arXiv

Stable-DiffCoder is a code-focused diffusion language model that learns to write and edit programs by filling in masked pieces, not just predicting the next token.

#diffusion language model #block diffusion #code generation

Diffusion In Diffusion: Reclaiming Global Coherence in Semi-Autoregressive Diffusion

Intermediate

Linrui Ma, Yufei Cui et al. · Jan 20 · arXiv

The paper proposes Diffusion in Diffusion, a draft-then-revise method that brings back global coherence to fast, block-based diffusion language models.

#discrete diffusion #block diffusion #semi-autoregressive

DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models

Intermediate

Lunbin Zeng, Jingfeng Yao et al. · Dec 17 · arXiv

This paper shows a simple way to turn any strong autoregressive (step-by-step) model into a diffusion vision-language model (parallel, block-by-block) without changing the architecture.

#DiffusionVL #diffusion vision-language model #block diffusion

LLaDA2.0: Scaling Up Diffusion Language Models to 100B

Intermediate

Tiwei Bie, Maosong Cao et al. · Dec 10 · arXiv

Before this work, most large language models generated text one token at a time (autoregressively), which made generation slow and hard to parallelize.

#diffusion language model #masked diffusion #block diffusion
