How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (6)


DFlash: Block Diffusion for Flash Speculative Decoding

Intermediate
Jian Chen, Yesheng Liang et al. · Feb 5 · arXiv

DFlash is a new way to make big language models answer much faster without changing the final answers.

#DFlash · #speculative decoding · #diffusion language model
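
The general recipe behind speculative decoding is easy to show in a toy: a cheap drafter proposes a block of tokens and the big model only verifies them, so the final text is exactly what the big model would have produced on its own. The sketch below uses deterministic stand-in "models" and a plain greedy acceptance rule; DFlash's block-diffusion drafter and its actual acceptance scheme are not reproduced here.

```python
# Toy greedy speculative decoding: draft a block cheaply, verify with the target.
def target_next_token(context):
    # Stand-in for the big target model: a deterministic toy next-token rule.
    return (sum(context) + 1) % 5

def draft_next_token(context):
    # Stand-in for the fast drafter: agrees with the target most of the time.
    t = (sum(context) + 1) % 5
    return t if len(context) % 4 else (t + 1) % 5

def speculative_decode(prompt, steps=12, block=4):
    tokens = list(prompt)
    while steps > 0:
        # 1) Cheaply draft a block of candidate tokens.
        draft, ctx = [], list(tokens)
        for _ in range(min(block, steps)):
            tok = draft_next_token(ctx)
            draft.append(tok)
            ctx.append(tok)
        # 2) Verify with the target model. In a real system this is ONE batched
        #    forward pass over the whole block (that is where the speedup comes
        #    from), and the output always equals what the target alone produces.
        accepted = 0
        for tok in draft:
            target_tok = target_next_token(tokens)
            tokens.append(target_tok)
            accepted += 1
            if tok != target_tok:
                break  # first mismatch ends the accepted prefix
        steps -= accepted
    return tokens

print(speculative_decode([1, 2, 3]))
```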

THINKSAFE: Self-Generated Safety Alignment for Reasoning Models

Intermediate
Seanie Lee, Sangwoo Park et al. · Jan 30 · arXiv

Large reasoning models have become very good at thinking step by step, but that sometimes makes them too eager to follow harmful instructions.

#THINKSAFE · #self-generated safety alignment · #refusal steering
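
As background for the "refusal steering" tag, the generic idea is to find a direction in the model's hidden space associated with refusals and nudge activations along it at inference time. The toy below uses random vectors in place of real hidden states; THINKSAFE's self-generated alignment pipeline is more involved and is not shown here.

```python
# Toy activation steering: push hidden states along an estimated refusal direction.
import numpy as np

rng = np.random.default_rng(0)
d = 16  # toy hidden size

# Pretend these are hidden states collected from harmful vs. harmless prompts.
harmful_acts = rng.normal(loc=1.0, size=(32, d))
harmless_acts = rng.normal(loc=0.0, size=(32, d))

# The "refusal direction": difference of the two class means, normalized.
refusal_dir = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
refusal_dir /= np.linalg.norm(refusal_dir)

def steer(hidden_state, strength=2.0):
    """Add the refusal direction to a hidden state before the next layer."""
    return hidden_state + strength * refusal_dir

h = rng.normal(size=d)
print("projection before:", float(h @ refusal_dir))
print("projection after: ", float(steer(h) @ refusal_dir))
```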

Fast KVzip: Efficient and Accurate LLM Inference with Gated KV Eviction

Intermediate
Jang-Hyun Kim, Dongyoon Han et al. · Jan 25 · arXiv

Fast KVzip is a new way to shrink an LLM’s memory (the KV cache) while keeping answers just as accurate.

#KV cache compression · #gated KV eviction · #sink attention
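
The basic shape of budgeted KV eviction is simple: score each cached position, keep the highest-scoring ones under a memory budget, and always preserve the first few "sink" tokens. The sketch below uses placeholder scores and plain NumPy arrays; Fast KVzip's learned gating is not reproduced here.

```python
# Minimal budgeted KV-cache eviction with sink-token protection (placeholder scores).
import numpy as np

def evict_kv(keys, values, scores, budget, num_sink=4):
    """Keep at most `budget` positions: the sink tokens plus the top-scored rest."""
    seq_len = keys.shape[0]
    if seq_len <= budget:
        return keys, values
    sink = np.arange(min(num_sink, budget))
    rest = np.arange(len(sink), seq_len)
    # Rank the non-sink positions by score and keep the best ones.
    keep_rest = rest[np.argsort(scores[rest])[::-1][: budget - len(sink)]]
    keep = np.sort(np.concatenate([sink, keep_rest]))  # preserve original order
    return keys[keep], values[keep]

rng = np.random.default_rng(0)
L, d = 128, 8
keys, values = rng.normal(size=(L, d)), rng.normal(size=(L, d))
scores = rng.random(L)  # e.g., how much attention mass each position has received

k2, v2 = evict_kv(keys, values, scores, budget=32)
print(k2.shape, v2.shape)  # (32, 8) (32, 8)
```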

Mecellem Models: Turkish Models Trained from Scratch and Continually Pre-trained for the Legal Domain

Intermediate
Özgür Uğur, Mahmut Göksu et al. · Jan 22 · arXiv

The paper builds special Turkish legal AI models called Mecellem by teaching them from the ground up and then giving them more law-focused lessons.

#Turkish legal NLP · #ModernBERT · #continual pre-training
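
The continual pre-training step boils down to taking an already pretrained encoder and continuing the same masked-language-modeling objective on in-domain legal text. The sketch below is a minimal Hugging Face version under assumed names: the checkpoint id, corpus path, and hyperparameters are illustrative placeholders, not the paper's actual recipe.

```python
# Hedged sketch: continual MLM pre-training of a pretrained encoder on legal text.
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

model_id = "answerdotai/ModernBERT-base"  # assumed checkpoint id, not the paper's
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Plain-text legal corpus, one document per line (placeholder path).
corpus = load_dataset("text", data_files={"train": "turkish_legal_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(output_dir="mecellem-cpt", per_device_train_batch_size=16,
                         num_train_epochs=1, learning_rate=1e-4)
Trainer(model=model, args=args, train_dataset=tokenized,
        data_collator=collator).train()
```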

Nemotron-Math: Efficient Long-Context Distillation of Mathematical Reasoning from Multi-Mode Supervision

Intermediate
Wei Du, Shubham Toshniwal et al. · Dec 17 · arXiv

Nemotron-Math is a giant math dataset with 7.5 million step-by-step solutions created in three thinking styles and with or without Python help.

#mathematical reasoning · #long-context fine-tuning · #multi-mode supervision
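
To make "multi-mode supervision with or without Python help" concrete, here is a hypothetical record layout for such a dataset: one problem, several solutions in different reasoning modes, some carrying a tool call. The field and mode names are illustrative guesses, not Nemotron-Math's actual schema.

```python
# Hypothetical multi-mode math record: one problem, solutions in several modes.
from dataclasses import dataclass, field

@dataclass
class MathSolution:
    mode: str                        # e.g. "short-cot", "long-cot", "tool-assisted" (assumed labels)
    reasoning: str                   # step-by-step natural-language solution
    python_code: str | None = None   # present only when the solution used Python

@dataclass
class MathRecord:
    problem: str
    final_answer: str
    solutions: list[MathSolution] = field(default_factory=list)

record = MathRecord(
    problem="What is 12 * 13?",
    final_answer="156",
    solutions=[
        MathSolution(mode="short-cot",
                     reasoning="12 * 13 = 12 * 10 + 12 * 3 = 120 + 36 = 156."),
        MathSolution(mode="tool-assisted",
                     reasoning="Compute the product with Python.",
                     python_code="print(12 * 13)"),
    ],
)
print(record.final_answer, len(record.solutions))
```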

Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed

Intermediate
Yonggan Fu, Lexington Whalen et al. · Dec 16 · arXiv

Autoregressive (AR) models write one word at a time, which is accurate but slow, especially when your computer or GPU can’t keep many tasks in memory at once.

#diffusion language models · #autoregressive models · #AR-to-dLM conversion
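
The contrast the paper starts from, one model call per token versus a few parallel refinement passes over a whole block, can be shown with toy stand-ins. The "models" below are deterministic placeholders; Efficient-DLM's actual AR-to-dLM conversion procedure is not shown.

```python
# Toy contrast: sequential autoregressive decoding vs. parallel block refinement.
def ar_decode(prompt, length, next_token):
    tokens = list(prompt)
    for _ in range(length):          # one model call per generated token
        tokens.append(next_token(tokens))
    return tokens

def diffusion_decode(prompt, length, denoise, passes=3):
    block = [0] * length             # start from a fully masked/placeholder block
    for _ in range(passes):          # a few parallel refinement passes over the block
        block = denoise(list(prompt), block)
    return list(prompt) + block

next_token = lambda ctx: (sum(ctx) + 1) % 7
denoise = lambda ctx, block: [(sum(ctx) + i + 1) % 7 for i in range(len(block))]

print(ar_decode([1, 2], length=6, next_token=next_token))
print(diffusion_decode([1, 2], length=6, denoise=denoise))
```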