How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (3)

#open-vocabulary detection

ObjEmbed: Towards Universal Multimodal Object Embeddings

Intermediate
Shenghao Fu, Yukun Su et al. | Feb 2 | arXiv

ObjEmbed teaches an AI to understand not just whole pictures, but each object inside them, and to link those objects to the right words.

#object embeddings #IoU embedding #visual grounding

A4-Agent: An Agentic Framework for Zero-Shot Affordance Reasoning

Intermediate
Zixin Zhang, Kanghao Chen et al. | Dec 16 | arXiv

This paper builds A4-Agent, a smart three-part helper that figures out where to touch or use an object just from a picture and a written instruction, without any extra training.

#affordance prediction #zero-shot learning #vision-language models

FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos

Intermediate
Yulu Gan, Ligeng Zhu et al. | Dec 11 | arXiv

FoundationMotion is a fully automatic pipeline that turns raw videos into detailed motion data, captions, and quizzes about how things move.

#motion understanding #spatio-temporal reasoning #video question answering