How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (2)

#sparsity

On the Mechanism and Dynamics of Modular Addition: Fourier Features, Lottery Ticket, and Grokking

Intermediate
Jianliang He, Leda Wang et al. | Feb 18 | arXiv

This paper explains, in detail, how a simple two-layer neural network learns to add numbers on a clock (modular addition) by building and combining wave-like patterns called Fourier features.

#modular addition #Fourier features #discrete Fourier transform
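
To make the idea of "combining wave-like patterns" concrete, here is a minimal hand-written sketch, not the trained network from the paper. It assumes each input is encoded as cosines and sines at every frequency, combines them with the angle-addition identities, and then scores each candidate answer. The function name `fourier_mod_add` and the choice to use all frequencies are illustrative assumptions.

```python
import numpy as np

def fourier_mod_add(a, b, p):
    """Compute (a + b) mod p using only cos/sin features (illustrative sketch)."""
    # Assumption: use every frequency k = 1 .. floor(p / 2).
    k = np.arange(1, p // 2 + 1)
    w = 2 * np.pi * k / p

    # Fourier features of each input, one cos/sin pair per frequency.
    ca, sa = np.cos(w * a), np.sin(w * a)
    cb, sb = np.cos(w * b), np.sin(w * b)

    # Angle-addition identities give features of the sum a + b.
    c_sum = ca * cb - sa * sb  # cos(w * (a + b))
    s_sum = sa * cb + ca * sb  # sin(w * (a + b))

    # Score each candidate c with sum_k cos(w_k * (a + b - c)).
    # Every term equals 1 only when c == (a + b) mod p, so that c wins.
    c = np.arange(p)
    phase = np.outer(c, w)
    logits = np.cos(phase) @ c_sum + np.sin(phase) @ s_sum
    return int(np.argmax(logits))

print(fourier_mod_add(5, 9, 11))
```

The scoring step works because the cosines at the correct answer all align, while for any other candidate they largely cancel across frequencies.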

Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss

Intermediate
Ang Lv, Jin Ma et al. | Dec 29 | arXiv

Mixture-of-Experts (MoE) models use many small specialist networks (experts) and a router to pick which experts handle each token, but the router isn't explicitly taught what each expert is good at.

#Mixture-of-Experts #expert-router coupling #auxiliary loss
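
The routing step described above can be sketched roughly as follows. This is a generic top-k router under assumed shapes, not the paper's method, and it does not include the auxiliary loss. The names `moe_forward`, `router_w`, and `expert_ws`, and the single-matrix `tanh` experts, are hypothetical choices for illustration.

```python
import numpy as np

def moe_forward(x, router_w, expert_ws, top_k=2):
    """Route each token to its top_k experts and mix their outputs (sketch).

    x:         (tokens, d_in) token representations
    router_w:  (d_in, n_experts) router weights (hypothetical layout)
    expert_ws: (n_experts, d_in, d_out), one weight matrix per toy expert
    """
    logits = x @ router_w                              # (tokens, n_experts)
    top = np.argsort(-logits, axis=1)[:, :top_k]       # chosen expert ids
    top_logits = np.take_along_axis(logits, top, axis=1)

    # Softmax over the selected experts only, so the gates sum to 1 per token.
    gates = np.exp(top_logits - top_logits.max(axis=1, keepdims=True))
    gates /= gates.sum(axis=1, keepdims=True)

    out = np.zeros((x.shape[0], expert_ws.shape[2]))
    for t in range(x.shape[0]):
        for j in range(top_k):
            e = top[t, j]
            # Toy expert: a single linear map followed by tanh.
            out[t] += gates[t, j] * np.tanh(x[t] @ expert_ws[e])
    return out, top, gates

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out, top, gates = moe_forward(x, rng.normal(size=(8, 3)), rng.normal(size=(3, 8, 5)))
print(out.shape, top.shape)
```

Note that nothing in this sketch tells the router what each expert does well; the router weights and expert weights are independent parameters, which is the gap the summary points to.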