This paper shows that strong image-understanding features alone are not enough to generate great images; the model also needs strong pixel-level detail.
Sparse-LaViDa makes diffusion-style AI models much faster by skipping unhelpful masked tokens during generation without sacrificing quality.
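
To make the token-skipping idea concrete, here is a toy sketch of why dropping masked tokens can save work. It is not Sparse-LaViDa's actual algorithm: the names `decode`, `toy_predict`, and `MASK` are hypothetical, the left-to-right decoding order is a simplification, and it assumes that "unhelpful" means masked positions that are not being filled in the current step. It only counts how many positions a model would have to process per step.

```python
MASK = "<mask>"  # hypothetical mask symbol, not taken from the paper


def decode(length, per_step, predict, sparse=True):
    """Fill a fully masked sequence, `per_step` positions at a time.

    Returns the decoded tokens and the total number of positions
    the (toy) model had to process across all steps.
    """
    tokens = [MASK] * length
    processed = 0
    while MASK in tokens:
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        targets = masked[:per_step]  # assumption: simple left-to-right order
        if sparse:
            # Keep already-decoded tokens plus only the masked slots
            # being filled in this step; skip every other masked slot.
            decoded = [i for i, t in enumerate(tokens) if t != MASK]
            active = decoded + targets
        else:
            # Dense baseline: process every position at every step.
            active = list(range(length))
        processed += len(active)
        for i in targets:
            tokens[i] = predict(tokens, i)
    return tokens, processed


def toy_predict(tokens, i):
    # Stand-in for a real model call; depends only on the position.
    return f"tok{i}"


dense_out, dense_cost = decode(8, 2, toy_predict, sparse=False)
sparse_out, sparse_cost = decode(8, 2, toy_predict, sparse=True)
print(dense_cost, sparse_cost)
```

In this toy setting both runs produce the same sequence, but the sparse run touches fewer positions because early steps see a much shorter input. Whether quality holds in a real model depends on how that model is trained and which tokens it skips, which this sketch does not capture.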