VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation
Intermediate · Shikun Sun, Liao Qu et al. · Jan 5 · arXiv
Visual Autoregressive (VAR) models generate a whole grid of image tokens at once at each of several scales, which makes standard reinforcement learning (RL) unstable.
#Visual Autoregressive (VAR) #Reinforcement Learning #GRPO