Beyond Length Scaling: Synergizing Breadth and Depth for Generative Reward Models
Intermediate · Qiyuan Zhang, Yufei Wang et al. · Mar 2 · arXiv
Longer explanations are not always better; the shape of thinking matters.
#Generative Reward Models #Chain-of-Thought #Breadth-CoT