Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation
Intermediate | Changze Lv, Jie Zhou et al. | Feb 3 | arXiv
DeepResearch agents write long, evidence-based reports, but teaching and grading them is hard because there is no single 'right answer' to score against.
#DeepResearch #query-specific rubrics #human preference learning