A-RAG lets the AI choose how to search, what to read, and when to stop, instead of following a fixed recipe.
WildGraphBench is a new test that checks how well GraphRAG systems find and combine facts from messy, real-world web pages.
CatRAG is a new way for AI to find the right facts by letting the knowledge graph change its paths based on each question.
AACR-Bench is a new test set that checks how well AI can do code reviews using the whole project, not just one file.
The paper builds Mecellem, a family of Turkish legal AI models, by training them from the ground up and then continuing their training on law-focused text.
This survey explains how large language models (LLMs) can clean, connect, and enrich messy data so it’s ready for real apps like dashboards, fraud detection, and training AI.
This paper explains how to turn large language models (LLMs) from quiet students that only answer questions into active agents that can plan, act, and learn over time.
This paper introduces MMDeepResearch-Bench (MMDR-Bench), a new test that checks how well AI “deep research agents” write long, citation-rich reports using both text and images.
The paper studies why large language models (LLMs) sound too sure of themselves when using retrieval-augmented generation (RAG) and how to fix it.
OpenDecoder teaches large language models (LLMs) to pay more attention to better documents during retrieval-augmented generation (RAG).
This paper introduces PCED, a way to use many documents as separate “experts” in parallel so an AI can stitch answers together without stuffing everything into one giant prompt.
ViDoRe V3 is a big, carefully built test that checks how well AI systems find and use information from both text and pictures (like tables and charts) in real documents.