Papers (8)

#teacher forcing

World Action Models are Zero-shot Policies

Seonghyeon Ye, Yunhao Ge et al. · Feb 17 · arXiv

DreamZero is a robot brain that learns actions by predicting short videos of the future and the matching moves at the same time.

#World Action Models · #DreamZero · #video diffusion

LIVE: Long-horizon Interactive Video World Modeling

Intermediate

Junchao Huang, Ziyang Ye et al. · Feb 3 · arXiv

LIVE is a new way to train video-making AIs so their mistakes don’t snowball over long videos.

#cycle consistency · #autoregressive video diffusion · #exposure bias

Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation

Intermediate

Hongzhou Zhu, Min Zhao et al. · Feb 2 · arXiv

The paper fixes a hidden mistake many fast video generators were making when turning a "see-everything" model into a "see-past-only" model.

#autoregressive video diffusion · #causal attention · #ODE distillation

VA-π: Variational Policy Alignment for Pixel-Aware Autoregressive Generation

Intermediate

Xinyao Liao, Qiyuan He et al. · Dec 22 · arXiv

Autoregressive (AR) image models make pictures by choosing tokens one by one, but they have been judged only on picking likely tokens, not on how good the final picture looks in pixels.

#autoregressive image generation · #tokenizer–generator alignment · #pixel-space reconstruction

End-to-End Training for Autoregressive Video Diffusion via Self-Resampling

Intermediate

Yuwei Guo, Ceyuan Yang et al. · Dec 17 · arXiv

This paper fixes a common problem in video-making AIs where tiny mistakes snowball over time and ruin long videos.

#autoregressive video diffusion · #exposure bias · #teacher forcing

Causal Judge Evaluation: Calibrated Surrogate Metrics for LLM Systems

Intermediate

Eddie Landesberg, Manjari Narayan · Dec 11 · arXiv

LLM judges are cheap but biased; without calibration they can completely flip which model looks best.

#LLM-as-judge · #calibration · #isotonic regression

Rethinking Training Dynamics in Scale-wise Autoregressive Generation

Intermediate

Gengze Zhou, Chongjian Ge et al. · Dec 6 · arXiv

This paper fixes two big problems in image-making AI that builds pictures step by step: the model practices with perfect answers (teacher forcing) but must later perform using its own imperfect guesses, and the earliest coarse steps are much harder than the later fine steps.

#visual autoregressive modeling · #next-scale prediction · #exposure bias
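
The gap between practicing with perfect answers and performing on your own guesses can be seen in a toy example. The sketch below is not taken from any of the papers listed here: `true_step`, `model_step`, and the growth rates 1.05 and 1.06 are made-up assumptions, chosen only to show how a small per-step error stays small under teacher forcing but compounds when the model feeds on its own outputs.

```python
def true_step(x):
    # Hypothetical ground-truth dynamics: the value grows 5% per step.
    return 1.05 * x


def model_step(x):
    # Hypothetical learned model that is slightly wrong at every step.
    return 1.06 * x


def rollout_errors(steps, x0=1.0):
    # Build the ground-truth sequence first.
    truth = [x0]
    for _ in range(steps):
        truth.append(true_step(truth[-1]))

    teacher_forced, free_running = [], []
    own = x0
    for t in range(steps):
        # Teacher forcing: predict from the ground-truth previous value.
        tf_pred = model_step(truth[t])
        # Free running: predict from the model's own previous prediction.
        own = model_step(own)
        teacher_forced.append(abs(tf_pred - truth[t + 1]))
        free_running.append(abs(own - truth[t + 1]))
    return teacher_forced, free_running


tf_err, fr_err = rollout_errors(steps=20)
print(f"step 1:  teacher-forced={tf_err[0]:.4f}  free-running={fr_err[0]:.4f}")
print(f"step 20: teacher-forced={tf_err[-1]:.4f}  free-running={fr_err[-1]:.4f}")
```

At the first step the two errors match, because both predictions start from the same true value. By the last step the free-running error is many times larger, which is the snowballing that the exposure-bias papers on this page try to prevent.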

EtCon: Edit-then-Consolidate for Reliable Knowledge Editing

Intermediate

Ruilin Li, Yibin Wang et al. · Dec 4 · arXiv

Large language models forget or misuse new facts if you only poke their weights once; EtCon fixes this with a two-step edit-then-consolidate plan.

#knowledge editing · #EtCon · #TPSFT
