LLaDA2.1 teaches a diffusion-style language model to write fast rough drafts and then fix its own mistakes by editing tokens it already wrote.
Autoregressive (AR) models normally write one token at a time, which is accurate but slow for long answers.
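
The contrast between the two decoding styles can be made concrete with a toy sketch. This is not LLaDA2.1's actual algorithm: `TARGET`, `toy_draft`, `toy_confidence`, and the 0.5 threshold are hypothetical stand-ins invented for illustration, and a lookup into a fixed sentence plays the role of the model. It only shows the control flow: an AR loop needs one model call per token, while a draft-then-edit loop writes every position at once and then rewrites only the tokens it is unsure about.

```python
from typing import List, Tuple

# Toy "ground truth" that stands in for what a real model would produce.
TARGET = ["the", "cat", "sat", "on", "the", "mat"]


def ar_decode(length: int) -> Tuple[List[str], int]:
    """Autoregressive decoding: one model call per generated token."""
    tokens: List[str] = []
    passes = 0
    for _ in range(length):
        tokens.append(TARGET[len(tokens)])  # stand-in for next-token prediction
        passes += 1
    return tokens, passes


def toy_draft(length: int) -> List[str]:
    """Hypothetical parallel draft: all positions at once, with two planted errors."""
    draft = list(TARGET[:length])
    draft[1] = "dog"
    draft[4] = "a"
    return draft


def toy_confidence(tokens: List[str]) -> List[float]:
    """Hypothetical scorer: high confidence where the token is right, low otherwise."""
    return [1.0 if tok == TARGET[i] else 0.1 for i, tok in enumerate(tokens)]


def draft_then_edit(length: int, threshold: float = 0.5,
                    max_edit_passes: int = 4) -> Tuple[List[str], int]:
    """Draft everything in one pass, then rewrite low-confidence tokens in place."""
    tokens = toy_draft(length)
    passes = 1  # the draft itself
    for _ in range(max_edit_passes):
        passes += 1  # each edit pass re-scores the whole sequence
        scores = toy_confidence(tokens)
        low = [i for i, s in enumerate(scores) if s < threshold]
        if not low:
            break  # nothing left to fix
        for i in low:
            tokens[i] = TARGET[i]  # stand-in for the model's corrected token
    return tokens, passes


ar_tokens, ar_passes = ar_decode(len(TARGET))
de_tokens, de_passes = draft_then_edit(len(TARGET))
print(ar_passes, de_passes)  # 6 3
```

In this toy setting both routes reach the same sentence, but the AR loop takes six passes and the draft-then-edit loop takes three: one draft, one pass that fixes the two wrong tokens, and one pass that confirms nothing else needs changing. Real savings depend on how many tokens the draft gets right and how many edit rounds are needed.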