How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (3)

#Brier Score

Blockwise Advantage Estimation for Multi-Objective RL with Verifiable Rewards

Intermediate
Kirill Pavlenko, Alexander Golubev et al. | Feb 10 | arXiv

The paper fixes a common mistake in training language models for multi-part tasks: giving the same reward signal to every token, even when different text parts aim at different goals.

#Blockwise Advantage Estimation #Outcome-Conditioned Baseline #Group Relative Policy Optimization

Agentic Uncertainty Quantification

Intermediate
Jiaxin Zhang, Prafulla Kumar Choubey et al. | Jan 22 | arXiv

Long AI tasks can go wrong early and keep getting worse, a snowballing of mistakes called the Spiral of Hallucination.

#Agentic Uncertainty Quantification #Spiral of Hallucination #Dual-Process Architecture

EpiCaR: Knowing What You Don't Know Matters for Better Reasoning in LLMs

Intermediate
Jewon Yeom, Jaewon Sok et al. | Jan 11 | arXiv

This paper teaches AI models not just how to solve problems but also how to tell when their own answers might be wrong.

#EPICAR #calibration #epistemic uncertainty