ArenaRL teaches AI agents by comparing their answers against each other, like a sports tournament, instead of giving each answer a single noisy score.
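A minimal sketch of the tournament idea, assuming a hypothetical `judge_prefers` comparator (in practice an LLM judge or a reward model): each answer's reward is its win rate against the other answers in its group, rather than an independent absolute score.

```python
from itertools import combinations

def judge_prefers(answer_a: str, answer_b: str) -> bool:
    """Hypothetical pairwise judge: True if answer_a beats answer_b.
    A real system would ask an LLM judge or a learned reward model."""
    return len(answer_a) > len(answer_b)  # placeholder heuristic for the sketch

def tournament_rewards(answers: list[str]) -> list[float]:
    """Score each answer by its win rate against every other answer in the
    same group, instead of giving each one a noisy absolute score."""
    wins = [0] * len(answers)
    for i, j in combinations(range(len(answers)), 2):
        if judge_prefers(answers[i], answers[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    n_opponents = len(answers) - 1
    return [w / n_opponents for w in wins]

if __name__ == "__main__":
    group = ["short answer", "a somewhat longer answer", "the longest, most detailed answer"]
    print(tournament_rewards(group))  # relative rewards usable by a policy-gradient update
```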
The paper teaches an AI to act like a careful traveler: it looks at a photo, forms guesses about where it might be, and uses real map tools to check each guess.
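A rough sketch of that guess-then-verify loop, with made-up `geocode` and `street_view_matches` functions standing in for real map tools:

```python
from dataclasses import dataclass

@dataclass
class Guess:
    place: str
    evidence: str  # visual clue that motivated the guess

def geocode(place: str) -> tuple[float, float]:
    """Hypothetical map tool: resolve a place name to (lat, lon).
    A real agent would call a geocoding service here."""
    fake_db = {"Lisbon": (38.72, -9.14), "San Francisco": (37.77, -122.42)}
    return fake_db.get(place, (0.0, 0.0))

def street_view_matches(photo: bytes, lat: float, lon: float) -> float:
    """Hypothetical verification tool: how well does imagery near (lat, lon)
    match the query photo? Returns a score in [0, 1]."""
    return 0.5  # placeholder for the sketch

def locate(photo: bytes, guesses: list[Guess]) -> Guess:
    """Form candidate locations from visual clues, then check each guess
    against map tools and keep the best-supported one."""
    best, best_score = None, -1.0
    for g in guesses:
        lat, lon = geocode(g.place)
        score = street_view_matches(photo, lat, lon)
        if score > best_score:
            best, best_score = g, score
    return best
```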
This paper teaches a camera to fix nighttime colors by combining a smart rule-based color trick (SGP-LRD) with a learning-by-trying helper (reinforcement learning).
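A toy illustration of the hybrid, assuming a simple gray-world white balance as a stand-in for the rule-based step (the real SGP-LRD differs) and a brute-force parameter search standing in for the reinforcement-learning helper:

```python
import numpy as np

def rule_based_correction(img: np.ndarray, strength: float) -> np.ndarray:
    """Stand-in for the rule-based step: gray-world white balance,
    blended with the original image by `strength` in [0, 1]."""
    gains = img.mean() / (img.reshape(-1, 3).mean(axis=0) + 1e-6)
    balanced = np.clip(img * gains, 0.0, 1.0)
    return (1 - strength) * img + strength * balanced

def quality_reward(img: np.ndarray) -> float:
    """Placeholder reward: prefer images whose color channels have similar
    means (a crude proxy for a neutral cast). A real system would use a
    learned or perceptual quality score."""
    return -float(np.std(img.reshape(-1, 3).mean(axis=0)))

def tune_strength(img: np.ndarray, candidates=(0.0, 0.25, 0.5, 0.75, 1.0)) -> float:
    """Learning-by-trying in miniature: try several correction strengths and
    keep the one with the best reward (a real RL agent would learn this)."""
    return max(candidates, key=lambda s: quality_reward(rule_based_correction(img, s)))
```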
This survey explains how AI judges are changing from single smart readers (LLM-as-a-Judge) into full-on agents that can plan, use tools, remember, and work in teams (Agent-as-a-Judge).
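A toy skeleton of the shift the survey describes, with a hypothetical `llm` call and a hypothetical `run_code` tool: the single-pass judge reads once and scores, while the agentic judge plans checks, calls tools, and keeps notes before deciding.

```python
def llm(prompt: str) -> str:
    """Hypothetical LLM call; stands in for any chat-completion API."""
    return "score: 7/10"

def llm_as_judge(question: str, answer: str) -> str:
    """Single-pass judging: one read, one verdict."""
    return llm(f"Rate this answer to '{question}':\n{answer}")

def run_code(snippet: str) -> str:
    """Hypothetical tool, e.g. a sandboxed interpreter for checking code answers."""
    return "tests passed"

def agent_as_judge(question: str, answer: str) -> str:
    """Agentic judging: plan the checks, call tools, remember findings, then decide."""
    memory: list[str] = []
    plan = llm(f"List the checks needed to judge this answer to '{question}'.")
    for step in plan.splitlines():
        if "run" in step.lower():                      # the plan asked for execution
            memory.append(run_code(answer))
        else:                                          # otherwise reason about the step
            memory.append(llm(f"Check '{step}' for answer:\n{answer}"))
    return llm("Given these findings, give a final verdict:\n" + "\n".join(memory))
```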
Long-term AI helpers remember past chats, but leaning on every stored memory can trap them in outdated conclusions, a failure the paper calls Memory Anchoring.
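A toy illustration of the trap, plus one generic mitigation (relevance filtering, which is not necessarily the paper's fix):

```python
from difflib import SequenceMatcher

def relevance(memory: str, task: str) -> float:
    """Toy lexical similarity; a real system would use embeddings."""
    return SequenceMatcher(None, memory.lower(), task.lower()).ratio()

def anchored_prompt(task: str, memories: list[str]) -> str:
    """The failure mode: every past memory is injected, so stale conclusions
    keep steering the model even after the task has moved on."""
    return "\n".join(memories) + f"\n\nTask: {task}"

def filtered_prompt(task: str, memories: list[str], threshold: float = 0.3) -> str:
    """A generic mitigation (not the paper's method): only inject memories
    that are sufficiently relevant to the current task."""
    kept = [m for m in memories if relevance(m, task) >= threshold]
    return "\n".join(kept) + f"\n\nTask: {task}"
```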
ATLAS is a system that picks the best mix of AI models and helper tools for each question, instead of using just one model or a fixed tool plan.
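A minimal sketch of per-question routing, with made-up model names, tools, and scoring rules (a real router would learn its quality predictions from past runs):

```python
from dataclasses import dataclass

@dataclass
class Config:
    model: str
    tools: tuple[str, ...]

CANDIDATES = [
    Config("small-llm", ()),                      # cheap, no tools
    Config("large-llm", ("calculator",)),         # stronger model plus arithmetic tool
    Config("large-llm", ("web_search", "code")),  # full toolbox for hard questions
]

def predicted_quality(cfg: Config, question: str) -> float:
    """Hypothetical router score for how well this model + tool mix would do."""
    score = 0.5 if cfg.model == "large-llm" else 0.3
    if "calculate" in question.lower() and "calculator" in cfg.tools:
        score += 0.3
    if "latest" in question.lower() and "web_search" in cfg.tools:
        score += 0.3
    return score

def route(question: str) -> Config:
    """Pick the best mix for this question, instead of one fixed model or tool plan."""
    return max(CANDIDATES, key=lambda c: predicted_quality(c, question))

print(route("What is the latest CPI figure, and calculate the year-over-year change?"))
```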
ThinkRL-Edit teaches an image editor to think first and draw second, which makes tricky, reasoning-heavy edits much more accurate.
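A bare-bones sketch of the think-first, draw-second split, with hypothetical `reason` and `apply_edit` functions standing in for the reasoning pass and the image editor:

```python
def reason(instruction: str, image_desc: str) -> str:
    """Hypothetical reasoning pass: turn a vague instruction into an explicit
    edit plan before any pixels change (e.g. resolve 'the taller person')."""
    return f"Plan: in '{image_desc}', find the taller of the two people and recolor only their jacket to red."

def apply_edit(image: bytes, plan: str) -> bytes:
    """Hypothetical editor conditioned on the explicit plan rather than the
    raw instruction; stands in for a diffusion-based editing model."""
    return image  # placeholder

def think_then_edit(image: bytes, image_desc: str, instruction: str) -> bytes:
    plan = reason(instruction, image_desc)   # think first
    return apply_edit(image, plan)           # draw second
```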
Robust-R1 teaches vision-language models to notice how a picture is damaged, think through what that damage hides, and then answer as if the picture were clear.
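A rough sketch of that degradation-aware flow, with hypothetical `describe_degradation` and `vlm` calls:

```python
def describe_degradation(image: bytes) -> str:
    """Hypothetical first step: name how the picture is damaged
    (blur, low light, rain, compression, ...)."""
    return "heavy motion blur on the right half"

def vlm(prompt: str, image: bytes) -> str:
    """Hypothetical vision-language model call."""
    return "a cyclist, probably wearing a helmet"

def robust_answer(image: bytes, question: str) -> str:
    """Notice the damage, reason about what it hides, then answer as if the
    picture were clear."""
    damage = describe_degradation(image)
    prompt = (
        f"The image suffers from: {damage}.\n"
        f"Reason about which details this damage could hide, then answer: {question}"
    )
    return vlm(prompt, image)
```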
Diffusion Preview is a two-step “preview-then-refine” workflow that shows you a fast draft image first and only spends full compute after you like the draft.
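A minimal sketch of the preview-then-refine flow, with a placeholder `sample` function standing in for any diffusion sampler that exposes a step count:

```python
def sample(prompt: str, steps: int, seed: int) -> str:
    """Hypothetical diffusion sampler; fewer steps means faster, rougher output.
    A real pipeline would expose something like a step count (e.g.
    `num_inference_steps` in Hugging Face diffusers). Returns a placeholder here."""
    return f"<image of '{prompt}' at {steps} steps, seed {seed}>"

def preview_then_refine(prompt: str, approve) -> str | None:
    """Spend a little compute on a draft first; only run the expensive full
    sampler if the draft is approved. Reusing the seed keeps the composition."""
    seed = 1234
    draft = sample(prompt, steps=8, seed=seed)      # fast, cheap preview
    if not approve(draft):
        return None                                 # rejected: full compute avoided
    return sample(prompt, steps=50, seed=seed)      # full-quality refinement

print(preview_then_refine("a lighthouse at dusk", approve=lambda img: True))
```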
COOPER is a single AI model that both “looks better” (perceives depth and object boundaries) and “thinks smarter” (reasons step by step) to answer spatial questions about images.
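A toy PyTorch sketch of the one-model-two-jobs idea (not COOPER's actual architecture): a shared backbone feeds perception heads for depth and object boundaries plus an answer head for spatial reasoning.

```python
import torch
import torch.nn as nn

class PerceiveAndReason(nn.Module):
    """Toy sketch: one backbone supports both perception heads ('looks better')
    and a reasoning/answer head ('thinks smarter')."""

    def __init__(self, dim: int = 64, n_answers: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, dim, 3, padding=1), nn.ReLU())
        self.depth_head = nn.Conv2d(dim, 1, 1)        # per-pixel depth
        self.boundary_head = nn.Conv2d(dim, 1, 1)     # object boundaries
        self.answer_head = nn.Linear(dim, n_answers)  # spatial-question answer

    def forward(self, image: torch.Tensor):
        feats = self.backbone(image)                  # shared features
        depth = self.depth_head(feats)
        boundaries = self.boundary_head(feats)
        pooled = feats.mean(dim=(2, 3))               # global summary for reasoning
        return depth, boundaries, self.answer_head(pooled)

model = PerceiveAndReason()
depth, boundaries, logits = model(torch.randn(1, 3, 64, 64))
```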