How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (2)

#cross-modal reasoning

XR: Cross-Modal Agents for Composed Image Retrieval

Beginner
Zhongyu Yang, Wei Pang et al. | Jan 20 | arXiv

XR is a new, training-free team of AI helpers that finds images using both a reference picture and a short text edit (like “same jacket but red”).

#Composed Image Retrieval #cross-modal reasoning #multi-agent system

Urban Socio-Semantic Segmentation with Vision-Language Reasoning

Intermediate
Yu Wang, Yi Wang et al. | Jan 15 | arXiv

Cities are full of places defined by people, like schools and parks, which are hard to see clearly from space without extra clues.

#socio-semantic segmentation #vision-language model #reinforcement learning