How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (4)


Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models

Intermediate
Shiting Huang, Zecheng Li et al. Β· Feb 10 Β· arXiv

The paper teaches large language models to do what good students do: find where they went wrong, turn that lesson into a rule, and remember it for next time.

#Reinforcement Learning with Verifiable Rewards Β· #RLVR Β· #Meta-Experience Learning
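The "find the mistake, turn it into a rule, remember it" loop can be sketched in a few lines. This is an illustrative toy only, not the paper's method: the `ExperienceMemory` class and its methods are hypothetical names, and real systems would distill lessons with a model rather than a string template.

```python
# Toy sketch of a "meta-experience" memory: mistakes become reusable
# rules that are replayed as guidance on future problems.
class ExperienceMemory:
    def __init__(self):
        self.rules = []  # distilled lessons, newest last

    def add_lesson(self, problem, wrong_answer, correct_answer):
        # Turn a concrete mistake into a reusable rule (here, a plain string).
        rule = (f"On problems like {problem!r}, avoid answering "
                f"{wrong_answer!r}; the correct answer was {correct_answer!r}.")
        if rule not in self.rules:
            self.rules.append(rule)

    def guided_prompt(self, problem):
        # Prepend remembered rules so past lessons guide the next attempt.
        header = "\n".join(f"- {r}" for r in self.rules)
        return f"Lessons learned:\n{header}\n\nProblem: {problem}"

mem = ExperienceMemory()
mem.add_lesson("2+2*2", "8", "6")  # precedence mistake becomes a rule
print(mem.guided_prompt("3+3*3"))
```

In the paper's setting the memory guides reinforcement learning rather than a raw prompt, but the shape of the loop is the same: fail, distill, store, reuse.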

Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation

Intermediate
Yanqi Dai, Yuxiang Ji et al. Β· Jan 28 Β· arXiv

This paper says that to make math-solving AIs smarter, we should train them more on the hardest questions they can almost solve.

#Mathematical reasoning Β· #RLVR Β· #GRPO
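"Hardest questions they can almost solve" can be made concrete with a difficulty weight peaked near a 50% pass rate. A minimal sketch under that assumption; the function names and the Gaussian-shaped weight are illustrative choices, not the paper's actual difficulty measure:

```python
import math

def difficulty_weight(pass_rate, target=0.5, sharpness=10.0):
    # Highest weight when the empirical pass rate is near the target;
    # decays as a question becomes too easy (~1.0) or hopeless (~0.0).
    return math.exp(-sharpness * (pass_rate - target) ** 2)

def select_hardest_solvable(questions_with_rates, k=2):
    # questions_with_rates: list of (question, empirical_pass_rate) pairs.
    ranked = sorted(questions_with_rates,
                    key=lambda qr: difficulty_weight(qr[1]),
                    reverse=True)
    return [q for q, _ in ranked[:k]]

pool = [("easy", 0.95), ("borderline", 0.45),
        ("medium", 0.7), ("impossible", 0.0)]
print(select_hardest_solvable(pool))  # β†’ ['borderline', 'medium']
```

Questions the model always solves or never solves contribute little gradient signal under GRPO-style training, which is why the weight concentrates the batch on the borderline cases.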

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Intermediate
Jingcheng Hu, Yinmin Zhang et al. Β· Jan 9 Β· arXiv

PaCoRe is a way for AI to think along many parallel paths and then coordinate them, so it can spend far more compute at test time without running out of context window space.

#Parallel Coordinated Reasoning Β· #Test-time compute scaling Β· #Message passing
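The parallel-then-coordinate idea can be sketched as many independent reasoning paths that exchange compact summaries instead of full transcripts. Everything here is a stand-in: `solve_path` and `summarize` fake the model calls, and the coordination step is reduced to a majority vote, which is far simpler than the paper's message passing.

```python
import collections

def solve_path(question, seed):
    # Stand-in for one sampled reasoning path; returns (answer, notes).
    answer = (seed % 3) + 1          # fake diversity across paths
    return answer, f"path {seed} reached answer {answer}"

def summarize(notes):
    # Compact message shared between paths; full traces stay local,
    # which is what keeps the context window from filling up.
    return notes[:40]

def parallel_coordinated_answer(question, n_paths=8):
    answers, messages = [], []
    for seed in range(n_paths):
        ans, notes = solve_path(question, seed)
        answers.append(ans)
        messages.append(summarize(notes))
    # Coordination step, reduced here to a majority vote over paths.
    final = collections.Counter(answers).most_common(1)[0][0]
    return final, messages

ans, msgs = parallel_coordinated_answer("toy question")
```

Because each path only exports a short summary, total test-time compute scales with the number of paths while the per-path context stays bounded.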

DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs

Intermediate
Shidong Cao, Hongzhan Lin et al. Β· Jan 7 Β· arXiv

DiffCoT treats a model’s step-by-step thinking (Chain-of-Thought) like a messy draft that can be cleaned up over time, not something fixed forever.

#Chain-of-Thought Β· #Diffusion models Β· #Autoregressive decoding