Most text-to-image models act like word-to-pixel copy machines and miss the hidden meaning in our prompts.
GARDO is a new way to fine-tune text-to-image diffusion models with reinforcement learning without falling into reward hacking, where the model exploits flaws in an imperfect reward signal instead of actually producing better images.