How I Study AI - Learn AI Papers & Lectures the Easy Way

Experiential Reinforcement Learning

Intermediate

Taiwei Shi, Sihao Chen et al. | Feb 15 | arXiv

This paper teaches AI models to learn like good students: try, think about what went wrong, fix it, and remember the fix.

#Experiential Reinforcement Learning #self-reflection #distillation

Self-Hinting Language Models Enhance Reinforcement Learning

Intermediate

Baohao Liao, Hanze Dong et al. | Feb 3 | arXiv

When rewards are rare, a popular training method for language models (GRPO) often stops learning because every try in a group gets the same score, so there is nothing to compare.

#reinforcement learning #GRPO #self-hinting
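
The stall described in the GRPO summary above can be seen in a few lines of arithmetic. The sketch below assumes the commonly used group-relative advantage, where each reward is centered on the group mean and divided by the group standard deviation; the function name, the `eps` constant, and the choice of population standard deviation are illustrative assumptions, not details taken from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Center each reward on the group mean and scale by the group std.

    Assumption: this mirrors the usual GRPO-style normalization; actual
    implementations may differ (e.g. sample std, or no std scaling).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# A group with mixed outcomes: successes get positive advantages,
# failures get negative ones, so there is a learning signal.
mixed = group_relative_advantages([1.0, 0.0, 0.0, 1.0])

# A sparse-reward group where every attempt failed: every advantage
# is exactly zero, so the policy-gradient update carries no signal.
sparse = group_relative_advantages([0.0, 0.0, 0.0, 0.0])

print(mixed)
print(sparse)
```

When every attempt in the group scores the same, each reward equals the group mean, so every advantage is zero regardless of how the scaling is done. That is the "nothing to compare" situation the summary refers to.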

Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning

Intermediate

Seijin Kobayashi, Yanick Schimpf et al. | Dec 23 | arXiv

The paper shows that big sequence models (like transformers) quietly learn longer goals inside their hidden activations, even though they are trained one step at a time.

#hierarchical reinforcement learning #temporal abstractions #autoregressive models
