How I Study AI - Learn AI Papers & Lectures the Easy Way

Janus: Disaggregating Attention and Experts for Scalable MoE Inference

Intermediate
Zhexiang Zhang, Ye Wang et al. · Dec 15 · arXiv

Janus splits a Mixture-of-Experts (MoE) model into two parts, attention and experts, so that each part can run on just the right number of GPUs.

#Mixture-of-Experts inference · #disaggregated serving · #activation load balancing
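
The summary describes a disaggregated serving layout: attention layers and expert FFNs are hosted on separate worker pools, so each pool can be scaled to its own load. The sketch below is a minimal, self-contained Python illustration of that idea, not the paper's implementation; the class names (AttentionWorker, ExpertWorker), the toy sizes, and the dummy attention/FFN math are all invented for illustration.

```python
# Minimal sketch of disaggregated MoE inference (illustrative, not Janus's code).
# Idea from the summary: attention and experts live on separate worker pools,
# so the number of "attention GPUs" and "expert GPUs" is chosen independently.
import numpy as np

HIDDEN, N_EXPERTS, TOP_K = 16, 4, 2   # invented toy sizes
rng = np.random.default_rng(0)

class AttentionWorker:
    """Stands in for a GPU that serves attention plus the MoE router."""
    def __init__(self):
        self.w_router = rng.standard_normal((HIDDEN, N_EXPERTS))

    def attend_and_route(self, h):
        # Placeholder for real self-attention: a cheap residual mixing step.
        h = h + 0.1 * h.mean(axis=0, keepdims=True)
        logits = h @ self.w_router
        routes = np.argsort(logits, axis=-1)[:, -TOP_K:]  # top-k experts per token
        return h, routes

class ExpertWorker:
    """Stands in for a GPU that hosts a single expert's FFN weights."""
    def __init__(self):
        self.w = rng.standard_normal((HIDDEN, HIDDEN)) / np.sqrt(HIDDEN)

    def ffn(self, h):
        return np.maximum(h @ self.w, 0.0)  # ReLU FFN placeholder

# The two pools are sized independently: one attention worker, four expert workers.
attn_worker = AttentionWorker()
expert_pool = [ExpertWorker() for _ in range(N_EXPERTS)]

tokens = rng.standard_normal((8, HIDDEN))          # hidden states for 8 tokens
h, routes = attn_worker.attend_and_route(tokens)   # step 1: attention side

out = np.zeros_like(h)
for e, worker in enumerate(expert_pool):           # step 2: expert side
    mask = (routes == e).any(axis=-1)              # tokens routed to expert e
    if mask.any():
        out[mask] += worker.ffn(h[mask]) / TOP_K   # average the top-k outputs

print(out.shape)  # (8, 16): results travel back to the attention side
```

In a real deployment the dispatch and combine steps cross the network between the two pools; the point of the sketch is only that the attention loop and the expert loop are separate components whose replica counts can be tuned independently.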