Visual Generation Tuning
Intermediate · Jiahao Guo, Sinan Du et al. · Nov 28 · arXiv
Before this work, large vision-language models (VLMs) were good at understanding images and text together but not at generating new images.
#Visual Generation Tuning #VGT-AE #Vision-Language Models