A-RAG lets the AI model choose how to search, what to read, and when to stop, instead of following a fixed retrieval recipe.
AACR-Bench is a new test set that measures how well AI can review code using context from the whole project, not just one file.