This paper builds a Google-for-theorems: a semantic search engine that finds exact theorems, lemmas, and propositions instead of just entire papers.
This paper teaches AI to look things up on the web and fix its own mistakes mid-thought instead of starting over from scratch.
MemSkill turns memory operations for AI agents into learnable skills instead of fixed, hand-made rules.
Mind-Brush turns image generation from a one-step 'read the prompt and draw' into a multi-step 'think, research, and create' process.
This paper builds a safe science “playground” called DeR that fairly tests how AI finds facts (retrieval) and how it thinks with those facts (reasoning) without mixing them up.
AgentLongBench is a new test that checks how well AI agents think over very long stories made of their own actions and the world's replies, not just over static documents.
This paper teaches a language-model agent to look up facts in millions of scientific paper summaries and answer clear, single-answer questions.
SAGE is a two-agent system that automatically writes tough, multi-step search questions and checks them by actually trying to solve them.
DRPG is a four-step AI helper that writes strong academic rebuttals by first breaking a review into parts, then fetching evidence, planning a strategy, and finally writing the response.
This paper argues that modern video generators are starting to act like tiny "world simulators," not just pretty video painters.
This survey explains how to make AI agents not just smart, but also efficient with their time, memory, and tool use.
The paper introduces M^4olGen, a two-stage system that designs new molecules to match exact numbers for several properties (like QED, LogP, MW, HOMO, LUMO) at the same time.