Papers (39)

#retrieval-augmented generation

DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval

DARE is a new way for AI assistants to find the right R functions by also looking at what the data looks like, not just the words in the question.

#distribution-aware retrieval · #RPKB · #RCodingAgent

Legal RAG Bench: an end-to-end benchmark for legal RAG

Beginner

Abdur-Rahman Butler, Umar Butler · Mar 2 · arXiv

Legal RAG Bench is a new, end-to-end test that checks how well legal AI systems find information and use it to answer tough, real-world legal questions.

#legal RAG · #retrieval-augmented generation · #embedding models

SciDER: Scientific Data-centric End-to-end Researcher

Beginner

Ke Lin, Yilin Lu et al. · Mar 2 · arXiv

SciDER is a team of smart AI helpers that can run almost the whole research process: think of ideas, read raw data, write and run code, and improve itself with feedback.

#data-centric AI · #AI research agent · #self-evolving memory

NanoKnow: How to Know What Your Language Model Knows

Beginner

Lingwei Gu, Nour Jedidi et al. · Feb 23 · arXiv

NanoKnow is a new benchmark that checks whether a language model’s answers come from what it saw during training or from extra text we give it at question time.

#NanoKnow · #FineWeb-Edu · #nanochat

Panini: Continual Learning in Token Space via Structured Memory

Intermediate

Shreyas Rajesh, Pavan Holur et al. · Feb 16 · arXiv

Panini lets an AI model keep learning new facts without changing its own parameters, by storing them as small, linked Q&A facts in an external memory.

#non-parametric continual learning · #structured memory · #Generative Semantic Workspace

AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research

Intermediate

Yishan Li, Wentong Chen et al. · Feb 6 · arXiv

This paper teaches small, local AI models to write deep, insightful research reports by letting writing and planning work together instead of staying separate.

#AgentCPM-Report · #WARP · #Writing As Reasoning Policy

Semantic Search over 9 Million Mathematical Theorems

Intermediate

Luke Alexander, Eric Leonen et al. · Feb 5 · arXiv

This paper builds a Google-for-theorems: a semantic search engine that finds exact theorems, lemmas, and propositions instead of just entire papers.

#semantic theorem search · #mathematical information retrieval · #dense retrieval

Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration

Intermediate

Bowei He, Minda Hu et al. · Feb 3 · arXiv

This paper teaches AI to look things up on the web and fix its own mistakes mid-thought instead of starting over from scratch.

#search-integrated reasoning · #reinforcement learning · #credit assignment

MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents

Intermediate

Haozhen Zhang, Quanyu Long et al. · Feb 2 · arXiv

MemSkill turns memory operations for AI agents into learnable skills instead of fixed, hand-made rules.

#memory skills · #LLM agents · #skill bank

Mind-Brush: Integrating Agentic Cognitive Search and Reasoning into Image Generation

Intermediate

Jun He, Junyan Ye et al. · Feb 2 · arXiv

Mind-Brush turns image generation from a one-step 'read the prompt and draw' into a multi-step 'think, research, and create' process.

#agentic image generation · #multimodal reasoning · #retrieval-augmented generation

PaperBanana: Automating Academic Illustration for AI Scientists

Beginner

Dawei Zhu, Rui Meng et al. · Jan 30 · arXiv

PaperBanana is a team of AI helpers that turns a paper’s method text and caption into a clean, accurate, publication-ready figure.

#academic illustration · #methodology diagrams · #visual language models

Retrieval-Infused Reasoning Sandbox: A Benchmark for Decoupling Retrieval and Reasoning Capabilities

Intermediate

Shuangshuang Ying, Zheyu Wang et al. · Jan 29 · arXiv

This paper builds a safe science “playground” called DeR that fairly tests how AI finds facts (retrieval) and how it thinks with those facts (reasoning) without mixing them up.

#retrieval-augmented generation · #document-grounded reasoning · #deep research benchmark
