ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents
Intermediate · Dawei Li, Yuguang Yao et al. · Jan 18 · arXiv
ToolPRMBench is a new benchmark that checks, step by step, whether an AI agent using tools picks the right next action.
#process reward model #tool-using agents #offline sampling