EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture
IntermediateXin He, Longhui Wei et al.Dec 4arXiv
EMMA is a single AI model that can understand images, write about them, create new images from text, and edit imagesβall in one unified system.
#EMMA#unified multimodal architecture#32x autoencoder