Papers3

#Spatial Reasoning

Language-based Trial and Error Falls Behind in the Era of Experience

Haoyu Wang, Guozheng Ma et al.Jan 29arXiv

Big language models are great at words but waste lots of time and energy when they try random actions in non-language games like Sudoku, Sokoban, 2048, FrozenLake, and Rubik’s Cube.

#SCOUT#Reinforcement Learning#Supervised Fine-Tuning

Not triaged yet

CoV: Chain-of-View Prompting for Spatial Reasoning

Intermediate

Haoyu Zhao, Akide Liu et al.Jan 8arXiv

This paper teaches AI to look around a 3D place step by step, instead of staring at a fixed set of pictures, so it can answer tricky spatial questions better.

#Chain-of-View Prompting#Embodied Question Answering#Active Viewpoint Reasoning

Not triaged yet

SpatialTree: How Spatial Abilities Branch Out in MLLMs

Intermediate

Yuxi Xiao, Longfei Li et al.Dec 23arXiv

SpatialTree is a new, four-level "ability tree" that tests how multimodal AI models (that see and read) handle space: from basic seeing to acting in the world.

#Spatial Intelligence#Multimodal Large Language Models#Hierarchical Benchmark

Not triaged yet