Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning
Beginner · Chris Samarinas, Haw-Shiuan Chang et al. · Feb 26 · arXiv
SLATE is a new way to teach AI to think step by step while using a search engine, giving feedback at each step instead of only at the end.
#retrieval-augmented reasoning #reinforcement learning #GRPO