Speculative decoding speeds up big language models by letting a small helper model guess the next several tokens (roughly, words) and having the big model check them all at once, keeping the guesses that match what it would have produced itself.
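
The guess-then-check loop can be illustrated with a minimal sketch. It assumes the simplest (greedy) variant, in which a guess is kept only if it equals the big model's own top choice; sampling-based variants use a probabilistic accept/reject test instead. The function names, the toy "models" (plain functions over integer tokens), and the parameter `k` are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # maps a context to its next token


def speculative_step(prefix: List[Token], draft_next: NextToken,
                     target_next: NextToken, k: int) -> List[Token]:
    """Run one guess-then-check round and return the tokens it emits."""
    # 1) The small model guesses k tokens, one after another.
    proposed: List[Token] = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2) The big model checks every guess. A real system scores all
    #    positions in one batched forward pass; this toy loops instead.
    emitted: List[Token] = []
    ctx = list(prefix)
    for tok in proposed:
        expected = target_next(ctx)
        if expected != tok:
            emitted.append(expected)  # first mismatch: use the big model's token
            return emitted
        emitted.append(tok)
        ctx.append(tok)

    # 3) All guesses matched, so the big model adds one extra token.
    emitted.append(target_next(ctx))
    return emitted


def generate(prefix: List[Token], draft_next: NextToken,
             target_next: NextToken, k: int, max_new: int) -> List[Token]:
    """Repeat rounds until max_new tokens have been produced."""
    out = list(prefix)
    while len(out) - len(prefix) < max_new:
        out.extend(speculative_step(out, draft_next, target_next, k))
    return out[:len(prefix) + max_new]


# Toy models (assumptions): the big model counts upward modulo 10;
# the small model agrees except right after token 4, where it guesses 0.
def target_next(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def draft_next(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


print(generate([0], draft_next, target_next, k=3, max_new=8))
```

In this greedy variant the output is identical to what the big model would produce alone; the saving comes from the big model confirming several tokens per round instead of one.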
ARBITRAGE makes AI solve step-by-step problems faster by using the big, slow model only when it is predicted to truly help.
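
The idea of calling the big model only when it is predicted to help can be sketched as a per-step routing loop. This is an illustrative sketch, not ARBITRAGE's actual implementation: the text above does not say how the prediction is made, so the `predicted_gain` function, the `threshold`, and the toy step generators below are all assumptions introduced for the example.

```python
from typing import Callable, List, Tuple

Steps = List[str]


def route_steps(
    problem: str,
    draft_step: Callable[[str, Steps], str],             # small, fast model (hypothetical)
    target_step: Callable[[str, Steps], str],            # big, slow model (hypothetical)
    predicted_gain: Callable[[str, Steps, str], float],  # assumed predictor of benefit
    threshold: float,
    max_steps: int,
) -> Tuple[Steps, int]:
    """Build a solution step by step, counting calls to the big model."""
    steps: Steps = []
    target_calls = 0
    for _ in range(max_steps):
        # Always start from the cheap model's proposal for this step.
        candidate = draft_step(problem, steps)
        # Pay for the big model only if it is predicted to help enough.
        if predicted_gain(problem, steps, candidate) > threshold:
            candidate = target_step(problem, steps)
            target_calls += 1
        steps.append(candidate)
    return steps, target_calls


# Toy stand-ins (assumptions): the predictor flags only the second step.
def toy_draft(problem: str, steps: Steps) -> str:
    return f"draft-{len(steps)}"


def toy_target(problem: str, steps: Steps) -> str:
    return f"target-{len(steps)}"


def toy_gain(problem: str, steps: Steps, candidate: str) -> float:
    return 0.9 if len(steps) == 1 else 0.1


solution, calls = route_steps("2 + 2 * 3", toy_draft, toy_target,
                              toy_gain, threshold=0.5, max_steps=3)
print(solution, calls)
```

Raising the threshold in this sketch trades accuracy for speed: fewer steps are sent to the big model, so more of the solution comes from the small one.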