How I Study AI - Learn AI Papers & Lectures the Easy Way

Diversity or Precision? A Deep Dive into Next Token Prediction

The paper shows that teaching a language model with a special “reward-shaped” next-token objective can make later reinforcement learning (RL) work much better.

#next-token prediction #cross-entropy as policy gradient #reward shaping

VA-$π$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation

Intermediate

Xinyao Liao, Qiyuan He et al. · Dec 22 · arXiv

Autoregressive (AR) image models make pictures by choosing tokens one by one, but they are judged only on picking likely tokens, not on how good the final picture looks in pixels.

#autoregressive image generation #tokenizer–generator alignment #pixel-space reconstruction

EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture

Intermediate

Xin He, Longhui Wei et al. · Dec 4 · arXiv

EMMA is a single AI model that can understand images, write about them, create new images from text, and edit images, all within one unified architecture.

#EMMA #unified multimodal architecture #32x autoencoder
