OCR is like reading a page exactly as it is, and that strictness makes it perfect for fast, parallel generation.
Diffusion language models (dLLMs) can write all parts of an answer in parallel, but they usually take many tiny cleanup steps, which makes them slow.