Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation
Intermediate · Yanqi Dai, Yuxiang Ji et al. · Jan 28 · arXiv
This paper argues that math-solving AI models improve most when training puts more weight on the hardest questions they can almost solve.
#Mathematical reasoning #RLVR #GRPO