Papers5

#Pass@1

$V_1$: Unifying Generation and Self-Verification for Parallel Reasoners

The paper shows that when a model compares two of its own answers head-to-head, it picks the right one more often than when it judges each answer alone.

#pairwise self-verification#test-time scaling#parallel reasoning

Not triaged yet

OmniGAIA: Towards Native Omni-Modal AI Agents

Intermediate

Xiaoxi Li, Wenxiang Jiao et al.Feb 26arXiv

OmniGAIA is a new test that checks if AI can watch videos, look at images, listen to audio, and use web and code tools in several steps to find a verified answer.

#OmniGAIA#OmniAtlas#Tool-Integrated Reasoning

Not triaged yet

UCoder: Unsupervised Code Generation by Internal Probing of Large Language Models

Intermediate

Jiajun Wu, Jian Yang et al.Dec 19arXiv

The paper introduces UCoder, a way to teach a code-generating AI to get better without using any outside datasets, not even unlabeled code.

#unsupervised code generation#self-training#internal probing

Not triaged yet

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Intermediate

Bingxiang He, Zekai Qu et al.Dec 18arXiv

JustRL shows that a tiny, steady recipe for reinforcement learning (RL) can make a 1.5B-parameter language model much better at math without fancy tricks.

#Reinforcement Learning#GRPO#Policy Entropy

Not triaged yet

Scaling Laws for Code: Every Programming Language Matters

Intermediate

Jian Yang, Shawn Guo et al.Dec 15arXiv

Different programming languages scale differently when training code AI models, so treating them all the same wastes compute and lowers performance.

#multilingual code pre-training#scaling laws#language-specific scaling

Not triaged yet