How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (30)

#LLM agents

LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents

Intermediate
Hyesung Jeon, Hyeongju Ha et al. | Feb 1 | arXiv

Multi-agent LLM systems often use LoRA adapters so each agent has a special role, but they all rebuild almost the same KV cache, wasting memory and time.

#LoRA #Multi-LoRA #KV cache

CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty

Intermediate
Johannes Kirmayr, Lukas Stappen et al. | Jan 29 | arXiv

CAR-bench is a new 'driving test' for AI assistants that checks if they can stay careful, honest, and consistent during real back-and-forth conversations in a car.

#LLM agents #benchmarking #consistency

Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives

Intermediate
Tengyue Xu, Zhuoyang Qian et al. | Jan 28 | arXiv

Idea2Story is a two-stage system that first studies many accepted research papers offline and then uses that knowledge online to turn a vague idea into a full scientific plan.

#autonomous scientific discovery #knowledge graph #method unit extraction

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

Intermediate
Mike A. Merrill, Alexander G. Shaw et al. | Jan 17 | arXiv

Terminal-Bench 2.0 is a tough test that checks how well AI agents can solve real, professional tasks by typing commands in a computer terminal.

#Terminal-Bench #command line interface #Docker containers

PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution

Intermediate
Minghao Yan, Bo Peng et al. | Jan 15 | arXiv

PACEvolve is a new recipe that helps AI agents improve their ideas step by step over long periods without getting stuck.

#evolutionary search #LLM agents #context management

Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering

Intermediate
Xinyu Zhu, Yuzhu Cai et al. | Jan 15 | arXiv

This paper builds an AI agent, ML-Master 2.0, that can work on machine learning projects for a very long time without forgetting what matters.

#Hierarchical Cognitive Caching #cognitive accumulation #ultra-long-horizon autonomy

ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback

Intermediate
Yutao Mou, Zhangchi Xue et al. | Jan 15 | arXiv

ToolSafe is a new way to keep AI agents safe when they use external tools, by checking each action before it runs.

#step-level safety #tool invocation #LLM agents

MAXS: Meta-Adaptive Exploration with LLM Agents

Intermediate
Jian Zhang, Zhiyuan Wang et al. | Jan 14 | arXiv

MAXS is a new way for AI agents to think a few steps ahead while using tools like search and code, so they make smarter choices.

#LLM agents #tool-augmented reasoning #lookahead

Imagine-then-Plan: Agent Learning from Adaptive Lookahead with World Models

Intermediate
Youwei Liu, Jian Wang et al. | Jan 13 | arXiv

Agents often act like tourists without a map: they react to what they see now and miss long-term consequences.

#Imagine-then-Plan #world models #adaptive lookahead

Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning

Intermediate
Jiaxuan Lu, Ziyu Kong et al. | Jan 12 | arXiv

This paper teaches AI to build and improve its own small computer helpers (tools) while solving science problems, instead of relying only on a fixed toolbox made beforehand.

#Test-Time Tool Evolution #Dynamic tool synthesis #Scientific reasoning

OpenTinker: Separating Concerns in Agentic Reinforcement Learning

Intermediate
Siqi Zhu, Jiaxuan You | Jan 12 | arXiv

OpenTinker is an open-source system that makes training AI agents with reinforcement learning simple, modular, and reusable.

#Reinforcement learning #LLM agents #Agent–environment interaction

EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis

Intermediate
Xiaoshuai Song, Haofei Chang et al. | Jan 9 | arXiv

EnvScaler is an automatic factory that builds many safe, rule-following practice worlds where AI agents can talk to users and call tools, just like real apps.

#EnvScaler #tool-interactive environments #programmatic synthesis