GARDO: Reinforcing Diffusion Models without Reward Hacking
Intermediate · Haoran He, Yuxiao Ye et al. · Dec 30 · arXiv
GARDO is a new way to fine-tune text-to-image diffusion models with reinforcement learning without reward hacking, that is, without the model exploiting flaws in the reward signal instead of producing better images.
#GARDO, #reward hacking, #gated KL regularization