How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (5)

Qwen3-ASR Technical Report

Intermediate
Xian Shi, Xiong Wang et al. · Jan 29 · arXiv

Qwen3-ASR is a family of speech models that hear, understand, and write down speech in 52 languages and dialects, and they can also tell you when each word was spoken; a sketch of such timestamped output follows below.

#ASR · #forced alignment · #timestamps
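As a rough illustration of word-level timestamps, the sketch below models each transcribed word as a (word, start, end) record. The WordSpan class and the sample values are hypothetical, not the actual Qwen3-ASR output format.

```python
from dataclasses import dataclass

@dataclass
class WordSpan:
    word: str     # transcribed word
    start: float  # onset, in seconds
    end: float    # offset, in seconds

# Hypothetical timestamped transcript for a short utterance;
# the real Qwen3-ASR output schema may differ.
transcript = [
    WordSpan("hello", 0.32, 0.61),
    WordSpan("world", 0.70, 1.05),
]

for w in transcript:
    print(f"{w.start:5.2f}-{w.end:5.2f}s  {w.word}")
```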

FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning

Intermediate
Zhaopeng Qiu, Shuang Yu et al. · Jan 26 · arXiv

The paper shows how to speed up reinforcement learning (RL) for large language models (LLMs) by switching to smaller 8-bit floating-point numbers (FP8) without breaking training; a minimal sketch of FP8 quantization follows below.

#FP8 quantization · #LLM reinforcement learning · #KV-cache
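To make the FP8 idea concrete, here is a minimal per-tensor quantization sketch in PyTorch (2.1+, which ships the float8_e4m3fn dtype). It shows the generic scale-and-cast mechanism only; the paper's actual stack (per-layer precision choices, KV-cache handling, RL integration) is more involved.

```python
import torch

def quantize_fp8(x: torch.Tensor):
    # Per-tensor symmetric scaling so the largest value maps to the
    # FP8 e4m3 maximum (448.0), then cast to one byte per element.
    fp8_max = torch.finfo(torch.float8_e4m3fn).max
    scale = x.abs().max().clamp(min=1e-12) / fp8_max
    return (x / scale).to(torch.float8_e4m3fn), scale

def dequantize_fp8(x_fp8: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return x_fp8.to(torch.float32) * scale

x = torch.randn(4, 8)
x_fp8, scale = quantize_fp8(x)
err = (dequantize_fp8(x_fp8, scale) - x).abs().max()
print(f"max round-trip error: {err:.4f}")  # small but nonzero
```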

Jet-RL: Enabling On-Policy FP8 Reinforcement Learning with Unified Training and Rollout Precision Flow

Intermediate
Haocheng Xi, Charlie Ruan et al. · Jan 20 · arXiv

Reinforcement learning (RL) for large language models is slow because the rollout (text generation) stage can take more than 70% of training time, especially for long, step-by-step answers; Jet-RL runs both training and rollout in FP8 under a unified precision flow so learning stays on-policy (a back-of-envelope speedup estimate follows below).

#FP8 quantization · #on-policy reinforcement learning · #precision flow
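That 70% rollout share puts a ceiling on what faster generation can buy. A back-of-envelope Amdahl's-law estimate, assuming FP8 doubles rollout throughput (an illustrative 2x, not a number from the paper):

```python
# Amdahl's law: only the rollout fraction of a training step speeds up.
rollout_frac = 0.70    # share of step time spent generating (from the summary)
rollout_speedup = 2.0  # assumed FP8 rollout speedup, for illustration only

overall = 1.0 / ((1.0 - rollout_frac) + rollout_frac / rollout_speedup)
print(f"end-to-end speedup: {overall:.2f}x")  # ~1.54x
```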

SWAA: Sliding Window Attention Adaptation for Efficient Long-Context LLMs Without Pretraining

Intermediate
Yijiong Yu, Jiale Liu et al. · Dec 11 · arXiv

Long texts make standard attention in large language models very slow because it checks every word against every other word; SWAA adapts an already-trained model to sliding window attention, where each word only checks its nearby neighbors, without pretraining from scratch (a sketch of the window mask follows below).

#Sliding Window Attention · #SWAA · #FA Decode
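The core mechanism is easy to show: a causal sliding-window mask lets position i attend only to the last `window` tokens, so attention cost grows as O(n·window) instead of O(n²). A generic sketch of such a mask (not SWAA's specific adaptation recipe) in PyTorch:

```python
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    # True where query position i may attend to key position j:
    # causal (j <= i) and within the last `window` positions.
    i = torch.arange(seq_len).unsqueeze(1)  # query positions, column
    j = torch.arange(seq_len).unsqueeze(0)  # key positions, row
    return (j <= i) & (j > i - window)

print(sliding_window_mask(seq_len=6, window=3).int())
```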

ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models

Intermediate
Long Lian, Sida Wang et al. · Nov 24 · arXiv

ThreadWeaver teaches a language model to split big problems into smaller parts it can solve at the same time, like teammates working in parallel; a minimal fork-join sketch follows below.

#adaptive parallel reasoning · #fork-join · #threaded inference
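The fork-join pattern itself can be sketched in a few lines: split a problem into independent parts, solve them concurrently, and merge the partial answers. The functions below are generic stand-ins for what, in ThreadWeaver, would be model-driven decomposition and decoding.

```python
from concurrent.futures import ThreadPoolExecutor

def solve(subproblem: str) -> str:
    # Stand-in for one reasoning thread; in the paper's setting this
    # would be the model decoding an independent sub-question.
    return f"answer({subproblem})"

def fork_join(problem: str) -> str:
    # Fork: split into independent parts (a trivial split here).
    parts = [f"{problem}/part{k}" for k in range(3)]
    # Run the parts in parallel, then join the partial answers.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(solve, parts))
    return " + ".join(partials)

print(fork_join("big-problem"))
```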