๐ŸŽ“How I Study AIHISA
๐Ÿ“–Read
๐Ÿ“„Papers๐Ÿ“ฐBlogs๐ŸŽฌCourses
๐Ÿ’กLearn
๐Ÿ›ค๏ธPaths๐Ÿ“šTopics๐Ÿ’กConcepts๐ŸŽดShorts
๐ŸŽฏPractice
๐Ÿ“Daily Log๐ŸŽฏPrompts๐Ÿง Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (5)


Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Intermediate
Ailin Huang, Ang Li et al. · Feb 11 · arXiv

Step 3.5 Flash is a huge but efficient AI model: it holds 196 billion total parameters but wakes up only about 11 billion per token, so it thinks smart and fast.

#Sparse Mixture-of-Experts #Sliding-Window Attention #Head-wise Gated Attention
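
Here is a minimal sketch (not the paper's code, with made-up toy sizes) of the sparse Mixture-of-Experts idea behind that "only wakes up ~11B per token" claim: a router scores all experts but runs just the top few for each token.

```python
# Toy sparse MoE layer: sizes and names are illustrative, not Step 3.5 Flash's real config.
import numpy as np

rng = np.random.default_rng(0)

d_model, d_ff = 64, 256        # toy dimensions; the real model is far larger
num_experts, top_k = 8, 2      # router activates only top_k of num_experts per token

# Each expert is a small two-layer MLP; only the chosen ones run for a given token.
experts = [
    (rng.standard_normal((d_model, d_ff)) * 0.02,
     rng.standard_normal((d_ff, d_model)) * 0.02)
    for _ in range(num_experts)
]
router_w = rng.standard_normal((d_model, num_experts)) * 0.02

def moe_layer(x):
    """x: (d_model,) hidden state for one token."""
    logits = x @ router_w                               # score every expert
    top = np.argsort(logits)[-top_k:]                   # keep only the top_k experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                                # softmax over the chosen experts
    out = np.zeros_like(x)
    for g, i in zip(gates, top):
        w_in, w_out = experts[i]
        out += g * (np.maximum(x @ w_in, 0.0) @ w_out)  # run just the chosen experts
    return out

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)   # (64,) — computed while touching only 2 of 8 experts
```

Because only `top_k` experts run per token, total parameters can grow with `num_experts` while per-token compute stays roughly constant.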

Qwen3-TTS Technical Report

Intermediate
Hangrui Hu, Xinfa Zhu et al. · Jan 22 · arXiv

Qwen3-TTS is a family of text-to-speech models that can talk in 10+ languages, clone a new voice from just 3 seconds, and follow detailed style instructions in real time.

#Qwen3-TTS #text-to-speech #voice cloning

MiMo-V2-Flash Technical Report

Intermediate
Xiaomi LLM-Core Team et al. · Jan 6 · arXiv

MiMo-V2-Flash is a giant but efficient language model that uses a team-of-experts design to think well while staying fast.

#Mixture-of-Experts #Sliding Window Attention #Global Attention
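
The sliding-window vs. global attention split in those tags comes down to the attention mask. Below is a small sketch (assumed, not from the report; the window size is an illustrative choice) comparing a full causal mask with a sliding-window causal mask.

```python
# Causal "global" mask vs. causal sliding-window mask (True = may attend).
import numpy as np

def causal_mask(seq_len):
    # Every token can attend to itself and all earlier tokens.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def sliding_window_mask(seq_len, window):
    # Every token attends only to itself and the previous (window - 1) tokens.
    m = causal_mask(seq_len)
    for q in range(seq_len):
        m[q, :max(0, q - window + 1)] = False
    return m

print(causal_mask(6).astype(int))                  # full lower triangle
print(sliding_window_mask(6, window=3).astype(int))  # narrow band near the diagonal
```

Models that mix the two typically use windowed layers for cheap local context and a few global layers so distant tokens can still talk to each other.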

K-EXAONE Technical Report

Intermediate
Eunbi Choi, Kibong Choi et al. · Jan 5 · arXiv

K-EXAONE is a super-sized language model that speaks six languages and can read very long documents (up to 256,000 tokens) without forgetting important details.

#Mixture-of-Experts #Hybrid Attention #Sliding Window Attention
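
A quick back-of-the-envelope sketch (the window size here is an assumption, not K-EXAONE's real config) of why windowed or hybrid attention matters at a 256,000-token context: full attention grows with the square of the sequence length, while a sliding window grows linearly.

```python
# Rough attention-score counts per head (ignoring the causal mask halving things).
seq_len = 256_000
window = 4_096                          # illustrative sliding-window size

full_scores = seq_len * seq_len         # every token looks at every token
windowed_scores = seq_len * window      # every token looks only at its window

print(f"full attention:   {full_scores:,} score entries")
print(f"sliding window:   {windowed_scores:,} score entries")
print(f"reduction factor: {full_scores / windowed_scores:.0f}x")
```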

NVIDIA Nemotron 3: Efficient and Open Intelligence

Intermediate
NVIDIA et al. · Dec 24 · arXiv

Nemotron 3 is a new family of open AI models (Nano, Super, Ultra) built to think better while running faster and cheaper.

#Nemotron 3 #Mixture-of-Experts #LatentMoE