Legal RAG Bench is a new end-to-end benchmark that measures how well legal AI systems retrieve information and use it to answer difficult, real-world legal questions.
The paper shows that language models equipped with a search tool often retrieve more information than they need, which wastes compute and can degrade their answers on unanswerable questions.