๐ŸŽ“How I Study AIHISA
๐Ÿ“–Read
๐Ÿ“„Papers๐Ÿ“ฐBlogs๐ŸŽฌCourses
๐Ÿ’กLearn
๐Ÿ›ค๏ธPaths๐Ÿ“šTopics๐Ÿ’กConcepts๐ŸŽดShorts
๐ŸŽฏPractice
๐Ÿ“Daily Log๐ŸŽฏPrompts๐Ÿง Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers1

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#graph theory

QEDBENCH: Quantifying the Alignment Gap in Automated Evaluation of University-Level Mathematical Proofs

Intermediate
Santiago Gonzalez, Alireza Amiri Bavandpour et al.Feb 24arXiv

This paper shows that when AI models grade university-level math proofs, they often disagree with human experts in systematic ways.

#LLM-as-a-Judge#mathematical proof evaluation#alignment gap