The paper shows that when a model compares two of its own answers head-to-head, it picks the correct one more often than when it judges each answer in isolation.
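
To make the two set-ups concrete, here is a minimal sketch of pointwise vs. pairwise self-judging. The `llm_generate` helper and both prompts are hypothetical illustrations, not taken from the paper.

```python
def llm_generate(prompt: str) -> str:
    """Hypothetical text-in/text-out helper; plug in your model API here."""
    raise NotImplementedError

def pointwise_judge(question: str, answer: str) -> bool:
    """Judge a single answer in isolation."""
    verdict = llm_generate(
        f"Question: {question}\nAnswer: {answer}\n"
        "Is this answer correct? Reply with exactly YES or NO."
    )
    return verdict.strip().upper().startswith("YES")

def pairwise_judge(question: str, answer_a: str, answer_b: str) -> str:
    """Compare two answers head-to-head and return the preferred one."""
    verdict = llm_generate(
        f"Question: {question}\n"
        f"Answer A: {answer_a}\n"
        f"Answer B: {answer_b}\n"
        "Which answer is correct? Reply with exactly A or B."
    )
    return answer_a if verdict.strip().upper().startswith("A") else answer_b
```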
The paper shows a fast, training-free way to boost an LLM’s step-by-step reasoning by reusing the probabilities the model already computes as it generates.
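
One common way to reuse those probabilities is confidence-based selection over sampled reasoning chains; the sketch below shows that generic idea, not necessarily the paper’s exact method. The `sample_with_logprobs` helper is a hypothetical stand-in for whatever sampling API you use.

```python
import math

def sample_with_logprobs(prompt: str) -> tuple[str, list[float]]:
    """Hypothetical helper: return (generated_text, per-token log-probs)."""
    raise NotImplementedError

def most_confident_answer(prompt: str, n_samples: int = 8) -> str:
    """Sample several chains; keep the one the model itself was surest of."""
    best_text, best_score = "", -math.inf
    for _ in range(n_samples):
        text, logprobs = sample_with_logprobs(prompt)
        score = sum(logprobs) / max(len(logprobs), 1)  # mean token log-prob
        if score > best_score:
            best_text, best_score = text, score
    return best_text
```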
This paper shows how a model can act as its own teacher, letting it climb past a learning plateau on very hard math problems.
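
A generic self-training round in this spirit looks like the sketch below: the model generates attempts, a verifier keeps the correct ones, and the same model is fine-tuned on them. `Problem`, `generate`, and `fine_tune` are hypothetical placeholders under those assumptions, not the paper’s actual recipe.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Problem:
    prompt: str
    check: Callable[[str], bool]  # verifier, e.g. compares the final answer

def self_training_round(
    generate: Callable[[str], str],                      # the current model, sampling
    fine_tune: Callable[[list[tuple[str, str]]], None],  # updates the same model
    problems: list[Problem],
    n_attempts: int = 16,
) -> None:
    """One round: generate attempts, keep verified ones, fine-tune on them."""
    kept: list[tuple[str, str]] = []
    for p in problems:
        for _ in range(n_attempts):
            solution = generate(p.prompt)
            if p.check(solution):          # only verified solutions survive
                kept.append((p.prompt, solution))
                break
    fine_tune(kept)                        # the model learns from its own output
```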