Papers (2)

#parallel reasoning

$V_1$: Unifying Generation and Self-Verification for Parallel Reasoners

The paper shows that when a model compares two of its own answers head-to-head, it picks the right one more often than when it judges each answer alone.

#pairwise self-verification #test-time scaling #parallel reasoning

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Beginner

Tong Wu, Yang Liu et al. · Dec 8 · arXiv

This paper teaches a language model to think along several paths at the same time instead of one step after another.

#parallel reasoning #reinforcement learning for LLMs #self-distillation