MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching
Intermediate · Changle Qu, Sunhao Dai et al. · Jan 15 · arXiv
MatchTIR trains tool-using AI agents by scoring each tool call individually, using bipartite matching, instead of giving the same reward to every step.
#Tool-Integrated Reasoning #Credit Assignment #Bipartite Matching