This paper tackles why training AI agents that act over many steps (like browsing the web or navigating a house) often becomes unstable and collapses.
Hyper-Connections (HC) replace the usual single residual shortcut in neural networks with several parallel streams and let the model learn how to mix them, but this mixing can become unstable as more layers are stacked.
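The parallel-stream idea can be sketched as follows; this is a minimal illustrative toy, not the paper's exact formulation, and the names (`hc_block`, `alpha`, `beta`) are assumptions:

```python
import numpy as np

def hc_block(streams, layer_fn, alpha, beta):
    """One Hyper-Connections-style block (illustrative sketch).

    streams: (n, d) array holding n parallel residual streams.
    alpha:   (n,) learned weights mixing the streams into one layer input.
    beta:    (n,) learned weights routing the layer output back to each stream.
    """
    x = alpha @ streams                  # (d,) weighted combination fed to the layer
    y = layer_fn(x)                      # layer output, shape (d,)
    return streams + np.outer(beta, y)   # each stream receives its share of the output

n, d = 4, 8
rng = np.random.default_rng(0)
streams = rng.normal(size=(n, d))
alpha = np.ones(n) / n                   # initialize as a plain average
beta = np.ones(n) / n
out = hc_block(streams, np.tanh, alpha, beta)
print(out.shape)  # (4, 8)
```

With `n = 1` and fixed weights of 1 this reduces to an ordinary residual connection; the instability arises when the learned mixing weights are free to grow or shrink across many stacked blocks.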
AT2PO is a new method for training AI agents that work over several turns, such as asking the web a question, reading the result, and trying again with a refined query.
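The multi-turn interaction pattern described above can be sketched as a simple episode loop; the `agent` and `env` objects here are hypothetical stand-ins, not AT2PO's actual interface:

```python
def run_episode(agent, env, max_turns=3):
    """Roll out one multi-turn episode: act, observe, act again."""
    history = []
    obs = env.reset()                    # e.g. the initial question
    for _ in range(max_turns):
        action = agent(obs, history)     # e.g. a web query
        obs, done = env.step(action)     # e.g. the search result
        history.append((action, obs))
        if done:
            break
    return history

# Toy stand-ins so the sketch runs end to end.
class ToyEnv:
    def __init__(self):
        self.turn = 0
    def reset(self):
        return "question"
    def step(self, action):
        self.turn += 1
        return f"result {self.turn}", self.turn >= 2

history = run_episode(lambda obs, h: f"query after: {obs}", ToyEnv())
print(len(history))  # 2
```

Training methods for this setting must assign credit across the whole trajectory, since the reward typically arrives only after the final turn.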