Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models
Intermediate · Xiaomin Yu, Yi Xin et al. · Feb 2 · arXiv
This paper finds a precise way to describe and fix the Modality Gap, which is when image and text features that mean the same thing still sit in different places in the AI's memory space.
#Modality Gap #Multimodal Large Language Models #Contrastive Learning