How I Study AI - Learn AI Papers & Lectures the Easy Way

Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs

Intermediate

Lecheng Yan, Ruizhe Li et al.Jan 16arXiv

The paper shows that when an LLM is trained with spurious (misleading) rewards in RLVR, it can score higher by memorizing answers instead of reasoning.

#RLVR#data contamination#memorization shortcuts

Papers1

Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs