DIFFA-2 is a new audio-language model that listens to speech, sounds, and music and answers questions about them. Instead of generating its answer one token at a time (the usual autoregressive method), it uses a diffusion-style language model.
This paper speeds up diffusion language models (dLLMs) by changing the order in which they fill in missing words.
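To make "the order in which they fill in missing words" concrete, here is a minimal toy sketch. It is not the paper's method: the "model" is a random stand-in, and the names `toy_scores`, `decode`, `left_to_right`, and `most_confident_first` are hypothetical, chosen only for illustration. It shows that a diffusion-style decoder starts from a fully masked sequence and that the rule for picking which positions to fill next is a separate, swappable policy.

```python
import random

MASK = None  # placeholder for a position that has not been filled yet


def toy_scores(tokens, vocab, rng):
    """Hypothetical stand-in for a model.

    For every masked position, return a (proposed_token, confidence) pair.
    A real dLLM would compute these from the whole partially filled sequence.
    """
    return {
        i: (rng.choice(vocab), rng.random())
        for i, tok in enumerate(tokens)
        if tok is MASK
    }


def left_to_right(preds, k):
    """Fill the k leftmost masked positions (mimics autoregressive order)."""
    return sorted(preds)[:k]


def most_confident_first(preds, k):
    """Fill the k masked positions with the highest confidence."""
    return sorted(preds, key=lambda i: preds[i][1], reverse=True)[:k]


def decode(length, vocab, pick_positions, per_step=2, seed=0):
    """Iteratively unmask a sequence; return the tokens and the fill order."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    order = []
    while any(tok is MASK for tok in tokens):
        preds = toy_scores(tokens, vocab, rng)
        for i in pick_positions(preds, per_step):
            tokens[i] = preds[i][0]
            order.append(i)
    return tokens, order


if __name__ == "__main__":
    vocab = ["a", "b", "c"]
    for policy in (left_to_right, most_confident_first):
        tokens, order = decode(6, vocab, policy)
        print(policy.__name__, order, tokens)
```

Both policies fill every position exactly once and take the same number of steps here; only the order differs. In a real model the order can matter for speed and quality, because each filled token changes what the model predicts for the positions still masked.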