GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts
Beginner · Wenhao Zeng, Xuteng Zhang et al. · Jan 8 · arXiv
Large reasoning AI models work through problems in many steps, which is slow and costly.
#collaborative inference #initial token entropy #step-level routing