Papers (3)

#parametric knowledge

NanoKnow: How to Know What Your Language Model Knows

Lingwei Gu, Nour Jedidi et al. · Feb 23 · arXiv

NanoKnow is a new benchmark that checks whether a language model’s answers come from what it saw during training or from extra text we give it at question time.

#NanoKnow #FineWeb-Edu #nanochat


Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality

Intermediate

Nitay Calderon, Eyal Ben-David et al. · Feb 15 · arXiv

Not every wrong answer from a large language model (LLM) means it never learned the fact; often the model has the fact stored but cannot recall it on demand.

#LLM factuality #encoding vs recall #knowledge profiling


The FACTS Leaderboard: A Comprehensive Benchmark for Large Language Model Factuality

Beginner

Aileen Cheng, Alon Jacovi et al. · Dec 11 · arXiv

The FACTS Leaderboard is a four-part test that checks how truthful AI models are across images, memory, web search, and document grounding.

#LLM factuality #benchmarking #multimodal evaluation
