How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (906)


Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation

Advanced
Andrei Panferov, Erik Schultheis et al. · Jan 30 · arXiv

This paper shows how to train big language models faster and cheaper by using 4-bit numbers (NVFP4) without losing much accuracy.

#NVFP4 · #FP4 training · #quantization-aware training
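
For a rough feel of what 4-bit training touches, here is a minimal fake-quantization sketch on the FP4 (E2M1) grid with per-block scales, in the spirit of NVFP4. The paper's actual contribution, an improved unbiased gradient estimator, is not shown; `fake_quant_fp4` and the block size are illustrative.

```python
# Minimal sketch: round-to-nearest FP4 (E2M1) fake-quantization with one
# scale per 16-element block. NOT the paper's unbiased gradient estimator,
# just the basic quantize/dequantize step such training builds on.
import torch

FP4_GRID = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])  # E2M1 magnitudes

def fake_quant_fp4(x: torch.Tensor, block: int = 16) -> torch.Tensor:
    flat = x.reshape(-1, block)
    scale = flat.abs().amax(dim=1, keepdim=True) / 6.0   # 6 = max FP4 magnitude
    scale = torch.where(scale == 0, torch.ones_like(scale), scale)
    scaled = flat / scale
    # Nearest-neighbour lookup on the FP4 grid; sign handled separately.
    idx = (scaled.abs().unsqueeze(-1) - FP4_GRID).abs().argmin(dim=-1)
    q = FP4_GRID[idx] * scaled.sign()
    return (q * scale).reshape(x.shape)

x = torch.randn(4, 32)
print((x - fake_quant_fp4(x)).abs().mean())  # mean quantization error
```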

VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration

Intermediate
Hanxun Yu, Wentong Li et al. · Jan 30 · arXiv

VisionTrim makes picture-and-text AI models run much faster by keeping only the most useful visual pieces (tokens) and smartly merging the rest.

#vision token compression · #training-free acceleration · #multimodal large language model
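
A generic keep-and-merge sketch of the idea, not VisionTrim's actual scoring or merging rules: rank tokens by an importance score, keep the top-k, and fold each pruned token into its most similar kept token instead of discarding it. `compress_tokens` and the stand-in scores are hypothetical.

```python
# Keep the k most important visual tokens; average each pruned token into
# its most similar kept token (cosine similarity) so information survives.
import torch
import torch.nn.functional as F

def compress_tokens(tokens: torch.Tensor, scores: torch.Tensor, keep: int):
    """tokens: (N, D) visual tokens; scores: (N,) importance; keep: tokens kept."""
    order = scores.argsort(descending=True)
    kept, pruned = tokens[order[:keep]], tokens[order[keep:]]
    sim = F.normalize(pruned, dim=-1) @ F.normalize(kept, dim=-1).T
    assign = sim.argmax(dim=-1)          # nearest kept token for each pruned one
    counts = torch.ones(keep)
    for i, j in enumerate(assign):       # running average merge
        kept[j] = (kept[j] * counts[j] + pruned[i]) / (counts[j] + 1)
        counts[j] += 1
    return kept

tokens = torch.randn(576, 1024)          # e.g. a 24x24 ViT patch grid
scores = torch.rand(576)                 # stand-in importance scores
print(compress_tokens(tokens, scores, keep=144).shape)  # -> (144, 1024)
```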

Pushing the Boundaries of Natural Reasoning: Interleaved Bonus from Formal-Logic Verification

Intermediate
Chuxue Cao, Jinluan Yang et al. · Jan 30 · arXiv

Large language models sometimes reach the right answer for the wrong reasons, which is risky and confusing.

#formal logic verification · #interleaved verification · #neuro-symbolic reasoning

Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling

Intermediate
Mingqian Feng, Xiaodong Liu et al. · Jan 30 · arXiv

Real attackers can try many prompts in parallel until a model slips, so testing safety with only one try badly underestimates risk.

#Best-of-N sampling · #Adversarial risk · #Attack Success Rate (ASR)
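
The core arithmetic is easy to check: if a single attempt succeeds with probability p, then N independent attempts succeed with probability 1 - (1 - p)^N. The toy numbers below are illustrative; the paper's estimator tackles the harder statistical problem of estimating this risk from finite samples.

```python
# Why single-try testing understates risk: even a tiny per-attempt success
# probability compounds quickly under Best-of-N sampling.
def asr_best_of_n(p: float, n: int) -> float:
    return 1.0 - (1.0 - p) ** n

for n in (1, 10, 100, 1000):
    print(f"N={n:>4}: ASR = {asr_best_of_n(0.005, n):.3f}")
# N=   1: ASR = 0.005
# N=  10: ASR = 0.049
# N= 100: ASR = 0.394
# N=1000: ASR = 0.993
```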

TTCS: Test-Time Curriculum Synthesis for Self-Evolving

Intermediate
Chengyi Yang, Zhishang Xiang et al. · Jan 30 · arXiv

TTCS lets a model teach itself at test time: it first generates easier practice questions similar to the real hard question, then learns from them before tackling it.

#test-time training · #test-time reinforcement learning · #curriculum learning
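
A sketch of the loop's shape, with hypothetical helpers (`generate_easier`, `solve_and_verify`, `update`) standing in for TTCS's actual question synthesis and reinforcement-learning update:

```python
# Shape of a test-time curriculum loop: make easier variants of the hard
# question, keep the ones the model can solve and verify, learn from them,
# then attempt the real question.
from typing import Callable, List, Optional, Tuple

def test_time_curriculum(
    model: Callable[[str], str],
    generate_easier: Callable[[str, int], List[str]],   # hypothetical helper
    solve_and_verify: Callable[[str], Optional[str]],   # hypothetical helper
    update: Callable[[List[Tuple[str, str]]], None],    # hypothetical helper
    hard_question: str,
    rounds: int = 3,
    k: int = 8,
) -> str:
    for _ in range(rounds):
        practice = generate_easier(hard_question, k)    # simpler, similar questions
        solved = [(q, a) for q in practice
                  if (a := solve_and_verify(q)) is not None]
        update(solved)                                  # e.g. a few fine-tuning steps
    return model(hard_question)
```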

Rethinking LLM-as-a-Judge: Representation-as-a-Judge with Small Language Models via Semantic Capacity Asymmetry

Intermediate
Zhuochun Li, Yong Zhang et al. · Jan 30 · arXiv

Big models are often used to grade AI answers, but they are expensive, slow, and depend too much on tricky prompts.

#Representation-as-a-Judge · #Semantic Capacity Asymmetry · #LLM-as-a-Judge

SSL: Sweet Spot Learning for Differentiated Guidance in Agentic Optimization

Beginner
Jinyang Wu, Changpeng Yang et al. · Jan 30 · arXiv

Most reinforcement learning agents only get a simple pass/fail reward, which hides how good or bad their attempts really were.

#Sweet Spot Learning · #tiered rewards · #reinforcement learning with verifiable rewards
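
To see what "tiered" means in contrast to pass/fail, here is a toy reward with made-up tiers and weights; SSL's actual sweet-spot scheme for differentiated guidance is its own.

```python
# Toy tiered reward: partially working attempts still get signal, instead
# of the flat 0/1 that hides how close an attempt came.
def tiered_reward(attempt: dict) -> float:
    if attempt["correct"]:
        return 1.0
    if attempt["passes_some_tests"]:
        return 0.5   # partially correct beats merely well-formed
    if attempt["parses"]:
        return 0.1   # well-formed but wrong beats malformed output
    return 0.0

print(tiered_reward({"correct": False, "passes_some_tests": True, "parses": True}))  # 0.5
```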

RAPTOR: Ridge-Adaptive Logistic Probes

Intermediate
Ziqi Gao, Yaotian Zhu et al. · Jan 29 · arXiv

RAPTOR is a simple, fast way to find a direction (a concept vector) inside a frozen language model that points toward a concept like 'sarcasm' or 'positivity.'

#probing · #concept vectors · #activation steering
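
The basic probe is just L2-regularized logistic regression on frozen activations, with the learned weight vector as the concept direction; this sketch uses plain sklearn with a fixed penalty standing in for RAPTOR's adaptive ridge, and the synthetic data is illustrative.

```python
# Fit a logistic probe on (hidden state, concept label) pairs; the weight
# vector is the concept direction. Synthetic data stands in for real
# activations of a frozen language model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
H = rng.normal(size=(200, 768))            # stand-in hidden states (N, D)
w_true = rng.normal(size=768)
y = (H @ w_true > 0).astype(int)           # synthetic 'concept' labels

probe = LogisticRegression(C=1.0, max_iter=1000).fit(H, y)  # C ~ 1/ridge strength
concept_vector = probe.coef_[0] / np.linalg.norm(probe.coef_[0])
print(concept_vector.shape)                # (768,) unit direction, usable for steering
```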

One-step Latent-free Image Generation with Pixel Mean Flows

Beginner
Yiyang Lu, Susie Lu et al. · Jan 29 · arXiv

This paper shows how to make a whole picture in one go, directly in pixels, without using a hidden “latent” space or many tiny steps.

#pixel MeanFlow · #one-step generation · #x-prediction
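
The one-step part can be sketched as a single Euler step with a network that predicts the average velocity over the whole noise-to-image path. `model` here is a dummy stand-in; the paper's x-prediction parameterization and pixel-space design are what make such a step viable without a latent space.

```python
# One-step sampling in a mean-flow style: the network predicts the average
# velocity over the full path, so one step from pure noise lands on an image.
import torch

def one_step_sample(model, shape=(1, 3, 256, 256)):
    z = torch.randn(shape)    # pure noise
    u = model(z)              # predicted average velocity over the whole path
    return z - u              # single Euler step: noise -> image

image = one_step_sample(lambda z: torch.zeros_like(z))  # dummy model, shape check only
print(image.shape)            # torch.Size([1, 3, 256, 256])
```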

Discovering Hidden Gems in Model Repositories

Intermediate
Jonathan Kahana, Eliahu Horwitz et al. · Jan 29 · arXiv

Millions of public AI models exist, but downloads are concentrated on a tiny set of “official” checkpoints, which are not always the best performers.

#hidden gems · #model repositories · #model trees

Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts

Intermediate
Yingfa Chen, Zhen Leng Thai et al. · Jan 29 · arXiv

This paper shows how to turn a big Transformer model into a faster hybrid model that mixes attention and RNN layers using far less training data (about 2.3B tokens).

#hybrid attention · #RNN attention hybrid · #linear attention
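
A skeleton of what such a hybrid stack looks like: most layers use an O(n) recurrent mixer, with full softmax attention kept only every few layers. The layer ratio is illustrative and `nn.GRU` is just a stand-in for the paper's linear-attention/RNN layers, not its distilled architecture.

```python
# Hybrid stack skeleton: cheap recurrent layers for most of the depth,
# full attention retained at a fixed interval for long-range precision.
import torch.nn as nn

def build_hybrid(depth=24, d_model=1024, n_heads=16, full_attn_every=6):
    layers = []
    for i in range(depth):
        if (i + 1) % full_attn_every == 0:
            layers.append(nn.MultiheadAttention(d_model, n_heads, batch_first=True))
        else:
            layers.append(nn.GRU(d_model, d_model, batch_first=True))  # RNN stand-in
    return nn.ModuleList(layers)

print(sum(isinstance(l, nn.MultiheadAttention) for l in build_hybrid()))  # 4 attention layers
```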

Exploring Reasoning Reward Model for Agents

Intermediate
Kaixuan Fan, Kaituo Feng et al. · Jan 29 · arXiv

The paper trains better AI agents by grading not just their final answers, but also how they reason and use tools along the way.

#Agentic Reinforcement Learning · #Reasoning Reward Model · #Process Supervision
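
A toy process-plus-outcome blend shows the general shape of such a reward; the weights and per-step scores below are made up, and the paper trains a reasoning reward model to produce these judgments rather than hand-coding them.

```python
# Blend per-step process scores with the final-answer outcome, so a
# trajectory with sound reasoning but a wrong answer still earns signal.
def trajectory_reward(step_scores: list[float], answer_correct: bool,
                      w_process: float = 0.4) -> float:
    process = sum(step_scores) / len(step_scores) if step_scores else 0.0
    outcome = 1.0 if answer_correct else 0.0
    return w_process * process + (1 - w_process) * outcome

print(trajectory_reward([1.0, 0.5, 1.0], answer_correct=False))  # ~0.333
```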