From Perception to Action: An Interactive Benchmark for Vision Reasoning
Beginner · Yuhao Wu, Maojia Song et al. · Feb 24 · arXiv
The paper introduces CHAIN, an interactive 3D benchmark that tests whether AI models can not only perceive objects but also plan and act under physical constraints.
#interactive benchmark #vision-language models #physical reasoning