XR: Cross-Modal Agents for Composed Image Retrieval
Beginner | Zhongyu Yang, Wei Pang et al. | Jan 20 | arXiv
XR is a new, training-free team of AI helpers that finds images using both a reference picture and a short text edit (like "same jacket but red").
#Composed Image Retrieval #cross-modal reasoning #multi-agent system