How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (6)

#LLM agents

RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System

Beginner
Yinjie Wang, Tianbao Xie et al. | Feb 2 | arXiv

RLAnything is a new reinforcement learning (RL) framework that trains three things at once: the policy (the agent), the reward model (the judge), and the environment (the tasks).

#reinforcement learning #closed-loop optimization #reward modeling

Paying Less Generalization Tax: A Cross-Domain Generalization Study of RL Training for LLM Agents

Beginner
Zhihan Liu, Lin Guan et al. | Jan 26 | arXiv

LLM agents are usually trained in a few worlds but asked to work in many different, unseen worlds, which often hurts their performance.

#cross-domain generalization #state information richness #planning complexity

Agentic Confidence Calibration

Beginner
Jiaxin Zhang, Caiming Xiong et al. | Jan 22 | arXiv

AI agents often act very sure of themselves even when they are wrong, especially on long, multi-step tasks.

#agentic confidence calibration #holistic trajectory calibration #general agent calibrator

The Confidence Dichotomy: Analyzing and Mitigating Miscalibration in Tool-Use Agents

Beginner
Weihao Xuan, Qingcheng Zeng et al. | Jan 12 | arXiv

This paper studies how tool-using AI agents express how sure they are and finds a split: some tools make them too sure, while others help them be honest.

#LLM agents #calibration #overconfidence

MemEvolve: Meta-Evolution of Agent Memory Systems

Beginner
Guibin Zhang, Haotian Ren et al. | Dec 21 | arXiv

MemEvolve teaches AI agents not only to remember past experiences but also to improve the way they remember, like a student who upgrades their study habits over time.

#LLM agents #agent memory #meta-evolution

SCOPE: Prompt Evolution for Enhancing Agent Effectiveness

Beginner
Zehua Pei, Hui-Ling Zhen et al. | Dec 17 | arXiv

SCOPE lets AI agents rewrite their own instructions while they are working, so they can fix mistakes and get smarter on the next step, not just the next task.

#prompt evolution #LLM agents #context management