๐ŸŽ“How I Study AIHISA
๐Ÿ“–Read
๐Ÿ“„Papers๐Ÿ“ฐBlogs๐ŸŽฌCourses
๐Ÿ’กLearn
๐Ÿ›ค๏ธPaths๐Ÿ“šTopics๐Ÿ’กConcepts๐ŸŽดShorts
๐ŸŽฏPractice
๐ŸงฉProblems๐ŸŽฏPrompts๐Ÿง Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers2

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#rollouts

Self-Improving Pretraining: using post-trained models to pretrain better models

Intermediate
Ellen Xiaoqing Tan, Shehzaad Dhuliawala et al.Jan 29arXiv

This paper teaches language models to be safer, more factual, and higher quality during pretraining, not just after, by using reinforcement learning with a stronger model as a helper.

#self-improving pretraining#reinforcement learning#online DPO

Imagine-then-Plan: Agent Learning from Adaptive Lookahead with World Models

Intermediate
Youwei Liu, Jian Wang et al.Jan 13arXiv

Agents often act like tourists without a map: they react to what they see now and miss long-term consequences.

#Imagine-then-Plan#world models#adaptive lookahead