🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
🧩Problems🎯Prompts🧠Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers2

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#Kernel Fusion

Scaling Embeddings Outperforms Scaling Experts in Language Models

Intermediate
Hong Liu, Jiaqi Zhang et al.Jan 29arXiv

The paper shows that growing the embedding part of a language model (especially with n-grams) can beat adding more MoE experts once you pass a certain sparsity 'sweet spot.'

#N-gram Embedding#Mixture-of-Experts (MoE)#Embedding Scaling

mHC: Manifold-Constrained Hyper-Connections

Intermediate
Zhenda Xie, Yixuan Wei et al.Dec 31arXiv

The paper fixes a stability problem in Hyper-Connections (HC) by gently steering the network’s mixing matrix onto a safe shape (a manifold) where signals don’t blow up or vanish.

#Residual Connections#Hyper-Connections#Manifold Projection