RubricBench: Aligning Model-Generated Rubrics with Human Standards
Intermediate | Qiyuan Zhang, Junyi Zhou et al. | Mar 2 | arXiv
RubricBench is a new benchmark that checks whether AI judges can use clear, checklist-style rules (rubrics) the way humans do.
#RubricBench, #rubric-guided evaluation, #reward models