OmniGAIA is a new test that checks whether AI can watch videos, look at images, listen to audio, and use web and code tools across several steps to find a verified answer.
This paper teaches robots to move their camera to a better spot before answering a question about what they see.