Papers (2)

#group relative policy optimization

Unified Personalized Reward Model for Vision Generation

Yibin Wang, Yuhang Zang et al. · Feb 2 · arXiv

The paper introduces UnifiedReward-Flex, a reward model that judges images and videos the way a thoughtful human would, flexibly changing what it checks based on the prompt and the visual evidence.

#personalized reward model #multimodal reward #context-adaptive reasoning


Unified Thinker: A General Reasoning Modular Core for Image Generation

Intermediate

Sashuai Zhou, Qiang Zhou et al. · Jan 6 · arXiv

Unified Thinker separates “thinking” (planning) from “drawing” (image generation) so complex instructions get turned into clear, doable steps before any pixels are painted.

#reasoning-aware image generation #structured planning #edit-only prompt
