RelayGen: Intra-Generation Model Switching for Efficient Reasoning
Intermediate | Jiwon Song, Yoongon Kim et al. | Feb 6 | arXiv
RelayGen is a training-free method that switches between a large model and a small model during the generation of a single response.
#RelayGen, #intra-generation model switching, #segment-level routing