Papers (7)

#long-horizon tasks

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Beginner

Zhenting Wang, Huancheng Chen et al. · Mar 4 · arXiv

This paper teaches long-horizon AI agents to recall past experience exactly without stuffing their whole memory into the context at once.

#indexed memory #LLM agents #long-horizon tasks

LongCLI-Bench: A Preliminary Benchmark and Study for Long-horizon Agentic Programming in Command-Line Interfaces

Beginner

Yukang Feng, Jianwen Sun et al. · Feb 15 · arXiv

LongCLI-Bench is a new test that checks how well AI coding agents can handle long, realistic software projects in the command line, not just tiny coding puzzles.

#LongCLI-Bench #agentic programming #command-line interface agents

AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration

Beginner

Jianhao Ruan, Zhihao Xu et al. · Feb 3 · arXiv

AOrchestra is like a smart conductor that builds the right mini-helpers (sub-agents) on demand to solve big, multi-step tasks.

#agent orchestration #sub-agent-as-tools #four-tuple abstraction

CUA-Skill: Develop Skills for Computer Using Agent

Intermediate

Tianyi Chen, Yinheng Li et al. · Jan 28 · arXiv

This paper builds a big, reusable library of computer skills so an AI can use Windows apps more like a careful human, not a clumsy robot.

#computer-using agents #desktop automation #skill library

Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning

Intermediate

Jinyang Wu, Shuo Yang et al. · Jan 28 · arXiv

SPARK is a new way to train AI agents that saves compute by exploring more only at the most important moments.

#SPARK #dynamic branching #strategic exploration

OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent

Intermediate

Bowen Yang, Kaiming Jin et al. · Jan 12 · arXiv

Computer-using agents kept forgetting important visual details over long tasks and could not reliably find up-to-date, step-by-step help for unfamiliar apps.

#computer-using agents #vision-language models #milestone memory

EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models

Intermediate

Zechen Bai, Chen Gao et al. · Dec 16 · arXiv

Robots usually learn by copying many demonstrations, which is expensive and makes them brittle when things change.

#EVOLVE-VLA #test-time training #vision-language-action
