The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models
Yan Wang, Yitao Xu et al. · Jan 6 · arXiv
Mixture-of-Experts (MoE) language models don't split cleanly into domain specialists; instead, a small, stable group of experts is selected again and again across many domains.
#Mixture-of-Experts · #Standing Committee · #Sparse routing
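To make the claim concrete, here is a minimal sketch of how one might detect such a "standing committee": tally each expert's top-k routing frequency per domain and intersect the top-ranked experts across domains. Everything here (synthetic logits, the shared-bias construction, the expert and domain counts) is an illustrative assumption, not the paper's actual method or data.

```python
import numpy as np

rng = np.random.default_rng(0)

num_experts = 64   # assumed MoE width, illustrative only
top_k = 2          # assumed top-k routing
num_tokens = 10_000
domains = ["code", "math", "news", "fiction"]

# Synthetic router logits: a shared bias makes a handful of experts
# attractive in every domain, mimicking the "standing committee" effect.
shared_bias = np.zeros(num_experts)
shared_bias[rng.choice(num_experts, size=6, replace=False)] = 2.0

def topk_counts(logits: np.ndarray, k: int) -> np.ndarray:
    """Count how often each expert lands in a token's top-k routing choices."""
    chosen = np.argpartition(-logits, k, axis=1)[:, :k]
    return np.bincount(chosen.ravel(), minlength=logits.shape[1])

per_domain_top = {}
for domain in domains:
    logits = rng.normal(size=(num_tokens, num_experts)) + shared_bias
    counts = topk_counts(logits, top_k)
    # Keep the experts that dominate routing decisions in this domain.
    per_domain_top[domain] = set(np.argsort(-counts)[:8])

# The "standing committee": experts ranked highly in *every* domain.
committee = set.intersection(*per_domain_top.values())
print("committee experts:", sorted(committee))
```

Under these assumptions the biased experts surface in every domain's top ranks, so the intersection is non-empty; with truly domain-specialized routing it would be close to empty.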