Compositional Generalization Requires Linear, Orthogonal Representations in Vision Embedding Models
IntermediateArnas Uselis, Andrea Dittadi et al.Feb 27arXiv
The paper asks a simple question: what must a vision modelβs internal pictures (embeddings) look like if it can recognize new mixes of things it already knows?
#compositional generalization#linear representation hypothesis#orthogonal representations