Papers (2)

#Spatial Intelligence

Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems

Song Wang, Lingdong Kong et al. · Dec 30 · arXiv

Robots such as self-driving cars and drones sense the world through many different sensors (cameras, LiDAR, radar, and even event cameras). This paper lays out a roadmap for teaching them to understand space by pre-training on data from all of these sensors together.

#Spatial Intelligence, #Multi-Modal Pre-Training, #Self-Supervised Learning

SpatialTree: How Spatial Abilities Branch Out in MLLMs

Intermediate

Yuxi Xiao, Longfei Li et al. · Dec 23 · arXiv

SpatialTree is a new, four-level "ability tree" that tests how multimodal AI models (models that both see and read) handle space, from basic seeing up to acting in the world.

#Spatial Intelligence, #Multimodal Large Language Models, #Hierarchical Benchmark
