Papers4

#post-training

Fast weight models remember context with a tiny, fixed memory, but standard next-token training teaches them to think only one word ahead.

Not triaged yet

WorldCompass teaches video world models to follow actions better and keep pictures pretty by using reinforcement learning after pretraining.

Not triaged yet

This paper shows how to make text-to-video models create clearer, steadier, and more on-topic videos without using any human-labeled ratings.

Not triaged yet

Language models can act like many characters, but they usually aim to be a helpful Assistant after post-training.

Not triaged yet