The FACTS Leaderboard: A Comprehensive Benchmark for Large Language Model Factuality
Beginner | Aileen Cheng, Alon Jacovi et al. | Dec 11 | arXiv
The FACTS Leaderboard is a four-part benchmark that measures how factually accurate AI models are when answering questions about images, recalling knowledge from memory, using web search, and grounding answers in provided documents.
#LLM factuality #benchmarking #multimodal evaluation