This paper shows that letting an AI run many searches at the same time (in parallel) can beat making it reason through one long, slow chain of steps.
Agentic-R is a new way to teach a search retriever to find not just similar text, but the text that truly helps an AI get the final answer right.
RL-trained search agents often sound confident even when they don't know, which can mislead people.