How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (2)

#Throughput Optimization

Revisiting Parameter Server in LLM Post-Training

Xinyi Wan, Penghui Qi et al. | Jan 27 | arXiv

Large language model (LLM) post-training has uneven work per GPU because some text sequences are much longer than others.

#On-Demand Communication #Fully Sharded Data Parallel #Parameter Server

SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations

Wentao Guo, Mayank Mishra et al. | Dec 16 | arXiv

SonicMoE makes Mixture-of-Experts (MoE) models train faster and use less memory by redesigning how data is moved and computed on GPUs.

#Mixture of Experts #Grouped GEMM #Token Rounding
