How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (4)

#post-training

Reinforced Fast Weights with Next-Sequence Prediction

Intermediate
Hee Seung Hwang, Xindi Wu et al. · Feb 18 · arXiv

Fast weight models remember context with a tiny, fixed memory, but standard next-token training teaches them to think only one word ahead.

#fast weight models #next-sequence prediction #reinforcement learning for LMs

WorldCompass: Reinforcement Learning for Long-Horizon World Models

Beginner
Zehan Wang, Tengfei Wang et al. · Feb 9 · arXiv

WorldCompass teaches video world models to follow actions better and keep pictures pretty by using reinforcement learning after pretraining.

#world models #reinforcement learning #clip-level rollout

PISCES: Annotation-free Text-to-Video Post-Training via Optimal Transport-Aligned Rewards

Intermediate
Minh-Quan Le, Gaurav Mittal et al. · Feb 2 · arXiv

This paper shows how to make text-to-video models create clearer, steadier, and more on-topic videos without using any human-labeled ratings.

#text-to-video #optimal transport #annotation-free

The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models

Intermediate
Christina Lu, Jack Gallagher et al. · Jan 15 · arXiv

Language models can act like many characters, but they usually aim to be a helpful Assistant after post-training.

#Assistant Axis #persona drift #activation capping