This paper teaches AI models to learn like good students: try, think about what went wrong, fix it, and remember the fix.
Most image search systems judge each photo by itself, which fails when clues are split across many photos taken over time.
EcoGym is a new open test playground where AI agents run small businesses over many days, which shows whether they can plan well for the long term.
ProAct teaches AI agents to think ahead accurately without needing expensive search every time they act.
The paper tackles a common problem: people can ask AI to do big, complex tasks, but they can’t always explain exactly what they want or check the results well.
DeepSearchQA is a new test with 900 real-world-style questions that checks whether AI agents can find complete lists of answers, not just one fact.
DeepPlanning is a new benchmark that tests whether AI can make long, realistic plans that fit time and money limits.
Robots need videos that not only look pretty but also follow real-world physics and show the requested task being finished.
Fast-ThinkAct teaches a robot to plan with a few tiny hidden "thought tokens" instead of long paragraphs, making it much faster while staying smart.
The paper builds a new way to create long, realistic conversations between people and AI in which the AI uses tools like databases.
ArenaRL teaches AI agents by comparing their answers against each other, like a sports tournament, instead of giving each answer a single noisy score.
Dream-VL and Dream-VLA use a diffusion language model backbone to understand images, talk about them, and plan actions better than many regular (autoregressive) models.