Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration
Intermediate · Bowei He, Minda Hu et al. · Feb 3 · arXiv
This paper teaches AI to look things up on the web and fix its own mistakes mid-thought instead of starting over from scratch.
#search-integrated reasoning #reinforcement learning #credit assignment