FS-Researcher is a two-agent system that lets AI carry out very long research tasks by saving its work to files in a folder, so it is not limited by how much the model can hold in its working memory at once.
This paper builds a new test called AgentIF-OneDay that checks if AI helpers can follow everyday instructions the way people actually give them.
VERGE is a teamwork system where an AI writer (an LLM) works with a strict logic checker (an SMT solver) to make answers both smart and logically sound.
The paper proposes Diffusion in Diffusion, a draft-then-revise method that brings back global coherence to fast, block-based diffusion language models.
The paper asks what a truly good diffusion-based language model should look like and lists five must-have properties.