How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (2)

Filtered by tag: #Load balancing loss

The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models

Intermediate
Yan Wang, Yitao Xu et al. · Jan 6 · arXiv

Mixture-of-Experts (MoE) language models don't split cleanly into domain specialists; instead, a small, stable group of experts gets chosen again and again across many subjects.

#Mixture-of-Experts · #Standing Committee · #Sparse routing

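To make the "standing committee" claim concrete, here is a minimal, hypothetical sketch of the kind of measurement involved: run a toy top-k router over token batches from several domains and compare each domain's most-used experts. Everything here (n_experts, top_k, the random stand-in "domains") is an illustrative assumption, not the paper's actual setup.

```python
import torch

torch.manual_seed(0)
n_experts, d_model, top_k = 16, 64, 2

# Toy linear router: logits = tokens @ W_r; each token picks its top-k experts.
W_r = torch.randn(d_model, n_experts)

def expert_counts(tokens: torch.Tensor) -> torch.Tensor:
    """Count how often each expert lands in a token's top-k for one batch."""
    logits = tokens @ W_r                      # [n_tokens, n_experts]
    top = logits.topk(top_k, dim=-1).indices   # [n_tokens, top_k]
    return torch.bincount(top.flatten(), minlength=n_experts)

# Stand-in "domains": random embedding batches. A real analysis would feed
# code, math, news, etc. through the trained model and log router choices.
domains = {name: torch.randn(1000, d_model) for name in ("code", "math", "news")}
usage = {name: expert_counts(x) for name, x in domains.items()}

# A standing committee shows up as the same few experts dominating usage in
# every domain: the intersection of per-domain top experts stays large.
committee = {name: set(c.topk(4).indices.tolist()) for name, c in usage.items()}
print("per-domain top-4 experts:", committee)
print("shared across all domains:", set.intersection(*committee.values()))
```

If routing were genuinely domain-specialized, the final intersection would be near-empty; a large, stable intersection across unrelated domains is the signature the paper's title describes.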

YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection

Intermediate
Xu Lin, Jinlong Peng et al. · Dec 29 · arXiv

YOLO-Master is a new real-time object detector that uses a Mixture-of-Experts (MoE) design to spend more compute on hard scenes and less on easy ones.

#YOLO-Master · #Mixture of Experts · #ES-MoE

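The "more compute on hard scenes" idea can be sketched as conditional computation: a gating network scores experts per image, and an uncertainty proxy decides how many experts actually run. The sketch below uses gate entropy as that proxy with an assumed threshold and toy shapes; it illustrates the general technique, not ES-MoE's actual routing rule.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveMoE(nn.Module):
    """Run 1 expert on 'easy' inputs and 3 on 'hard' ones (gate entropy as proxy)."""
    def __init__(self, dim=128, n_experts=4):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))

    def forward(self, x):                          # x: [batch, dim] pooled scene features
        probs = F.softmax(self.gate(x), dim=-1)    # [batch, n_experts]
        # Entropy of the gate distribution: a confident (low-entropy) gate is
        # treated as an easy scene, an uncertain gate as a hard one.
        entropy = -(probs * probs.clamp_min(1e-9).log()).sum(dim=-1)
        k_per_item = (entropy > 1.0).long() * 2 + 1  # assumed threshold: 1 or 3 experts
        out = torch.zeros_like(x)
        for i in range(x.size(0)):                 # per-item routing, kept explicit for clarity
            top_p, top_idx = probs[i].topk(int(k_per_item[i]))
            for p, j in zip(top_p.tolist(), top_idx.tolist()):
                out[i] += p * self.experts[j](x[i])
        return out

moe = AdaptiveMoE()
print(moe(torch.randn(8, 128)).shape)  # torch.Size([8, 128])
```

The per-item loop is deliberately naive: the point is that easy inputs exit after one expert while hard ones pay for three, which is how an MoE detector can keep average latency low without capping quality on difficult scenes.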