DFlash: Block Diffusion for Flash Speculative Decoding
Intermediate | Jian Chen, Yesheng Liang et al. | Feb 5 | arXiv
DFlash is a speculative decoding method that drafts tokens with a block diffusion model, making large language models respond much faster without changing their final answers.
#DFlash #speculative decoding #diffusion language model