๐ŸŽ“How I Study AIHISA
๐Ÿ“–Read
๐Ÿ“„Papers๐Ÿ“ฐBlogs๐ŸŽฌCourses
๐Ÿ’กLearn
๐Ÿ›ค๏ธPaths๐Ÿ“šTopics๐Ÿ’กConcepts๐ŸŽดShorts
๐ŸŽฏPractice
๐ŸงฉProblems๐ŸŽฏPrompts๐Ÿง Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers3

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#agentic LLMs

daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently

Intermediate
Mohan Jiang, Dayuan Fu et al.Feb 2arXiv

Long tasks trip up most AIs because they lose track of goals and make small mistakes that snowball over many steps.

#long-horizon agency#pull request chains#software evolution

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Intermediate
Weixun Wang, XiaoXiao Xu et al.Dec 31arXiv

This paper builds an open, end-to-end ecosystem (ALE) that lets AI agents plan, act, and fix their own mistakes across many steps in real computer environments.

#agentic LLMs#reinforcement learning#IPA

Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs

Intermediate
Junbo Li, Peng Zhou et al.Dec 18arXiv

Turn-PPO is a new way to train chatty AI agents that act over many steps, by judging each conversation turn as one whole action instead of judging every single token.

#Turn-PPO#multi-turn reinforcement learning#agentic LLMs