Fast weight models remember context with a small, fixed-size memory, but standard next-token training teaches them to think only one token ahead.
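A minimal sketch of the fixed-memory idea, using a delta-rule outer-product update in numpy; `fast_weight_step`, `fast_weight_read`, and the 8x8 sizes are illustrative, not the paper's architecture:

```python
import numpy as np

def fast_weight_step(S, k, v, beta=0.1):
    """One fast-weight (delta-rule) update: the memory S stays a fixed
    d_k x d_v matrix no matter how long the context grows."""
    v_old = S.T @ k                           # what S currently recalls for key k
    return S + beta * np.outer(k, v - v_old)  # overwrite toward the new value

def fast_weight_read(S, q):
    """Query the memory with a vector q."""
    return S.T @ q

d_k, d_v = 8, 8
S = np.zeros((d_k, d_v))
rng = np.random.default_rng(0)
for _ in range(100):                          # 100 tokens, but S stays 8x8
    S = fast_weight_step(S, rng.standard_normal(d_k), rng.standard_normal(d_v))
print(fast_weight_read(S, rng.standard_normal(d_k)).shape)  # (8,)
```

Next-token training would grade each read against only the immediately following token, which is the one-step-ahead limitation the summary points to.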
The paper teaches AI models to budget their thinking time like a smart test-taker who must finish several questions before the bell rings.
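A toy sketch of the test-taker pacing idea: split a fixed token budget across questions in proportion to estimated difficulty. The heuristic and the name `allocate_thinking_budget` are hypothetical, not the paper's actual allocator:

```python
def allocate_thinking_budget(difficulties, total_tokens, min_tokens=32):
    """Give every question a floor of min_tokens, then split the rest
    in proportion to estimated difficulty, like pacing a timed exam."""
    floor = min_tokens * len(difficulties)
    spare = max(total_tokens - floor, 0)
    total_d = sum(difficulties) or 1.0
    return [min_tokens + int(spare * d / total_d) for d in difficulties]

# Three questions, the middle one estimated hardest, 1000 tokens total:
print(allocate_thinking_budget([0.2, 0.6, 0.2], total_tokens=1000))
# -> [212, 574, 212] (integer rounding may drop a few tokens)
```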
The paper shows that making a model write a number as a sequence of digits and then grading the whole number at the end works better than grading each digit separately.
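A minimal sketch of the contrast, with toy reward functions standing in for per-digit versus whole-number grading; the names and the digit-string setup are illustrative:

```python
def per_digit_reward(pred_digits, target_digits):
    """Grade each digit independently (dense, per-token reward)."""
    hits = sum(p == t for p, t in zip(pred_digits, target_digits))
    return hits / len(target_digits)

def whole_number_reward(pred_digits, target_digits):
    """Grade only the final assembled number (sequence-level reward)."""
    return float(int("".join(pred_digits)) == int("".join(target_digits)))

pred, target = list("1203"), list("1230")
print(per_digit_reward(pred, target))     # 0.5: two digits match by position
print(whole_number_reward(pred, target))  # 0.0: the number itself is wrong
```

Per-digit grading still rewards a partially right string even when the number is wrong; the whole-number signal only pays off when the final answer is correct.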