UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?
Intermediate | Zimo Wen, Boxiu Li et al. | Mar 3 | arXiv
This paper introduces UniG2U-Bench, a large benchmark for finding out when image generation actually helps models understand images and text together.
#Unified multimodal models #Vision-language models #Generation-to-Understanding (G2U)