TreeGRPO: Tree-Advantage GRPO for Online RL Post-Training of Diffusion Models
Intermediate | Zheng Ding, Weirui Ye | Dec 9 | arXiv
TreeGRPO post-trains image-generating diffusion models with online reinforcement learning, organizing sampling as a branching tree so that each training run produces many useful learning signals instead of just one.
#TreeGRPO #reinforcement learning #diffusion models