AgentVista is a new test (benchmark) that checks whether AI agents can solve tough, real-life picture-based problems by using multiple tools over many steps.
SenseNova-MARS is a vision-language model that can think step-by-step and use three tools—text search, image search, and image cropping—during its reasoning.