How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (16)

#robot manipulation

RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics

Intermediate
Enshen Zhou, Cheng Chi et al. | Dec 15 | arXiv

RoboTracer is a vision-language model that turns tricky, word-only instructions into safe, step-by-step 3D paths (spatial traces) robots can follow.

#RoboTracer #spatial trace #3D spatial referring

Spatial-Aware VLA Pretraining through Visual-Physical Alignment from Human Videos

Intermediate
Yicheng Feng, Wanpeng Zhang et al. | Dec 15 | arXiv

Robots often see the world as flat pictures but must move in a 3D world, which makes accurate actions hard.

#Vision-Language-Action #3D spatial grounding #visual-physical alignment

FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos

Intermediate
Yulu Gan, Ligeng Zhu et al. | Dec 11 | arXiv

FoundationMotion is a fully automatic pipeline that turns raw videos into detailed motion data, captions, and quizzes about how things move.

#motion understanding #spatio-temporal reasoning #video question answering

LEO-RobotAgent: A General-purpose Robotic Agent for Language-driven Embodied Operator

Intermediate
Lihuang Chen, Xiangyu Luo et al. | Dec 11 | arXiv

LEO-RobotAgent is a lightweight framework that lets a language model think, plan, and operate many kinds of robots using natural language.

#LEO-RobotAgent #language-driven robotics #LLM agent
