LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding
Intermediate · Alexander Samarin, Sergei Krutikov et al. · Feb 27 · arXiv
Speculative decoding speeds up large language models by letting a small draft model guess several next tokens and having the large model verify them all in a single pass.
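The guess-and-check step can be sketched in a few lines. The code below is a minimal illustration of the standard speculative sampling accept/reject rule, not the LK losses method itself; the function names `expected_acceptance` and `verify_drafts` are hypothetical, and distributions are passed in as plain lists of probabilities rather than produced by real models.

```python
import random


def expected_acceptance(p, q):
    """Chance that one token drafted from q is accepted under target p.

    For the standard rule this equals sum_x min(p(x), q(x)).
    """
    return sum(min(pi, qi) for pi, qi in zip(p, q))


def verify_drafts(draft_tokens, q_probs, p_probs, rng):
    """Check k drafted tokens against the target model's distributions.

    draft_tokens: k token ids sampled from the draft model
    q_probs:      k draft distributions, one per drafted position
    p_probs:      k + 1 target distributions (the last one supplies a
                  bonus token when every draft is accepted)
    Returns the tokens actually emitted in this step.
    """
    vocab = list(range(len(p_probs[0])))
    out = []
    for i, tok in enumerate(draft_tokens):
        p, q = p_probs[i], q_probs[i]
        # Accept the drafted token with probability min(1, p/q).
        if rng.random() < min(1.0, p[tok] / q[tok]):
            out.append(tok)
            continue
        # On rejection, resample from the residual max(p - q, 0)
        # and discard the remaining drafted tokens.
        residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
        out.append(rng.choices(vocab, weights=residual)[0])
        return out
    # All drafts accepted: sample one extra token from the target.
    out.append(rng.choices(vocab, weights=p_probs[-1])[0])
    return out
```

The closer the draft distribution is to the target distribution, the larger `expected_acceptance` becomes and the more drafted tokens survive each verification pass, which is why acceptance rate is the quantity of interest here.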
#speculative decoding #acceptance rate #LK losses