🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
🧩Problems🎯Prompts🧠Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers3

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#multimodal agent

Think3D: Thinking with Space for Spatial Reasoning

Beginner
Zaibin Zhang, Yuhan Wu et al.Jan 19arXiv

Think3D lets AI models stop guessing from flat pictures and start exploring real 3D space, like walking around a room in a video game.

#Think3D#spatial reasoning#3D reconstruction

SenseNova-MARS: Empowering Multimodal Agentic Reasoning and Search via Reinforcement Learning

Intermediate
Yong Xien Chng, Tao Hu et al.Dec 30arXiv

SenseNova-MARS is a vision-language model that can think step-by-step and use three tools—text search, image search, and image cropping—during its reasoning.

#multimodal agent#vision-language model#reinforcement learning

Active Perception Agent for Omnimodal Audio-Video Understanding

Intermediate
Keda Tao, Wenjie Du et al.Dec 29arXiv

This paper introduces OmniAgent, a smart video-and-audio detective that actively decides when to listen and when to look.

#active perception#omnimodal understanding#audio-guided event localization