Groups
Group theory gives a precise language for symmetries, and neural networks can exploit known symmetries of the data, through equivariant architectures or invariant predictions, to learn faster and generalize better.
Mixture of Experts
A Mixture of Experts (MoE) routes each input to a small subset of specialized models, called experts, enabling conditional computation.