How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (5)

#Mixture of Experts

HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing

Intermediate
Yizhao Gao, Jianyu Wei et al. · Feb 3 · arXiv

HySparse is a new attention design that mixes a few full-attention layers with many fast, memory-saving sparse layers (a toy sketch of the idea follows below the tags).

#Hybrid Sparse Attention #Oracle Token Selection #KV Cache Sharing
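To make the hybrid idea concrete, here is a minimal NumPy sketch of a layer stack where a few layers use full attention and the rest only look at a handful of top-scoring keys. The layer ratio, the top-k selection rule, and every name below are made up for illustration; they are not HySparse's actual oracle token selection or KV cache sharing scheme.

```python
# Hypothetical sketch: most layers use sparse attention over a few selected keys,
# a few layers keep full attention. All choices here are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def full_attention(q, k, v):
    # Standard attention: every query looks at every key.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def sparse_attention(q, k, v, keep=8):
    # Sparse attention: each query keeps only its `keep` highest-scoring keys,
    # which cuts compute and lets the KV cache stay small.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    idx = np.argsort(scores, axis=-1)[:, -keep:]          # top-k keys per query
    masked = np.full_like(scores, -np.inf)
    np.put_along_axis(masked, idx, np.take_along_axis(scores, idx, axis=-1), axis=-1)
    return softmax(masked) @ v

seq, dim, n_layers = 32, 16, 12
x = np.random.randn(seq, dim)
for layer in range(n_layers):
    # e.g. one full layer out of every four, the rest sparse (made-up ratio).
    attn = full_attention if layer % 4 == 0 else sparse_attention
    x = x + attn(x, x, x)
```

The point of the mix is that the rare full layers keep global context available while the many sparse layers do most of the work cheaply.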

NativeTok: Native Visual Tokenization for Improved Image Generation

Intermediate
Bin Wu, Mengqi Huang et al. · Jan 30 · arXiv

This paper fixes a hidden mismatch in image generation: tokenizers make tokens without order, but generators need an order to predict the next token well.

#visual tokenization #autoregressive image generation #causal dependencies

ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation

Beginner
Zihao Huang, Jundong Zhou et al. · Jan 29 · arXiv

ConceptMoE teaches a language model to group easy, similar tokens into bigger ideas called concepts, so it spends more brainpower on the hard parts (a toy compression sketch follows below the tags).

#ConceptMoE #Mixture of Experts #Adaptive Compression
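Here is a toy sketch of the token-to-concept compression idea: neighbouring token embeddings that look very similar get averaged into one "concept" vector, so later layers see fewer positions for easy stretches of text. The cosine-similarity threshold and merge rule are assumptions for illustration, not ConceptMoE's actual mechanism.

```python
# Illustrative token-to-concept compression (assumed merge rule, not the paper's).
import numpy as np

def compress_to_concepts(token_embs, sim_threshold=0.9):
    concepts, current = [], [token_embs[0]]
    for emb in token_embs[1:]:
        prev = current[-1]
        sim = emb @ prev / (np.linalg.norm(emb) * np.linalg.norm(prev) + 1e-8)
        if sim > sim_threshold:
            current.append(emb)              # easy/similar token: fold into the group
        else:
            concepts.append(np.mean(current, axis=0))
            current = [emb]                  # hard/novel token: start a new concept
    concepts.append(np.mean(current, axis=0))
    return np.stack(concepts)                # shorter sequence -> compute goes further

tokens = np.random.randn(64, 32)
concepts = compress_to_concepts(tokens)
print(tokens.shape, "->", concepts.shape)    # fewer rows wherever neighbours were similar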

YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection

Intermediate
Xu Lin, Jinlong Peng et al. · Dec 29 · arXiv

YOLO-Master is a new real-time object detector that uses a Mixture-of-Experts (MoE) design to spend more compute on hard scenes and less on easy ones (a generic routing sketch follows below the tags).

#YOLO-Master #Mixture of Experts #ES-MoE
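The sketch below shows plain MoE routing with a crude "difficulty" signal deciding how many experts fire for a region. It is a generic top-k router written for illustration only; the gate, the entropy heuristic, and the expert count are assumptions and not the paper's ES-MoE design.

```python
# Generic MoE routing sketch: a gate scores each image-region feature and routes it
# to one or two toy experts. The difficulty heuristic here is an assumption.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
n_experts, dim = 4, 16
gate_w = rng.normal(size=(dim, n_experts))                         # toy linear gate
experts = [rng.normal(size=(dim, dim)) for _ in range(n_experts)]  # toy expert weights

def moe_forward(region_feat):
    gate = softmax(region_feat @ gate_w)
    # Crude difficulty proxy: a flat (high-entropy) gate means no expert is clearly
    # best, so spend more compute there (assumption made for this sketch).
    entropy = -(gate * np.log(gate + 1e-9)).sum()
    k = 2 if entropy > 0.8 * np.log(n_experts) else 1
    top = np.argsort(gate)[-k:]                                    # chosen experts
    out = sum(gate[i] * (region_feat @ experts[i]) for i in top)
    return out / gate[top].sum()                                   # renormalise weights

region = rng.normal(size=dim)
print(moe_forward(region).shape)                                   # (16,)
```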

SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations

Intermediate
Wentao Guo, Mayank Mishra et al. · Dec 16 · arXiv

SonicMoE makes Mixture-of-Experts (MoE) models train faster and use less memory by redesigning how data is moved and computed on GPUs (a toy grouped-GEMM sketch follows below the tags).

#Mixture of Experts #Grouped GEMM #Token Rounding
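To show what "grouped GEMM" means in an MoE layer, the sketch below sorts tokens by their assigned expert so each expert runs one dense matmul over a contiguous block instead of many scattered per-token multiplies. It only illustrates the data layout; SonicMoE's IO- and tile-aware GPU kernels and its token-rounding scheme are not reproduced here.

```python
# Grouped-GEMM layout sketch for an MoE layer (illustration of the general idea only).
import numpy as np

rng = np.random.default_rng(0)
n_tokens, dim, n_experts = 128, 32, 4
tokens = rng.normal(size=(n_tokens, dim))
expert_w = rng.normal(size=(n_experts, dim, dim))
assignment = rng.integers(0, n_experts, size=n_tokens)    # expert chosen per token

# Group tokens by expert: sort once, then slice contiguous blocks.
order = np.argsort(assignment)
sorted_tokens = tokens[order]
counts = np.bincount(assignment, minlength=n_experts)
offsets = np.concatenate([[0], np.cumsum(counts)])

out_sorted = np.empty_like(sorted_tokens)
for e in range(n_experts):
    block = sorted_tokens[offsets[e]:offsets[e + 1]]               # all tokens for expert e
    out_sorted[offsets[e]:offsets[e + 1]] = block @ expert_w[e]    # one GEMM per expert

# Scatter results back to the original token order.
out = np.empty_like(out_sorted)
out[order] = out_sorted
print(out.shape)                                                   # (128, 32)
```

Keeping each expert's tokens contiguous is what lets a GPU run a few large, efficient matmuls instead of many tiny ones.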