The paper teaches an AI to act like a careful traveler: it looks at a photo, forms guesses about where it might be, and uses real map tools to check each guess.
This paper builds MFMD-Scen, a big test to see how AI changes its true/false judgments about the same money-related claim when the situation around it changes.
This paper teaches a camera to fix nighttime colors by combining a smart rule-based color trick (SGP-LRD) with a learning-by-trying helper (reinforcement learning).
Re-Align is a new way for AI to make and edit pictures by thinking in clear steps before drawing.
This survey explains how AI judges are changing from single smart readers (LLM-as-a-Judge) into full-on agents that can plan, use tools, remember, and work in teams (Agent-as-a-Judge).
Big reasoning AIs think in many steps, which is slow and costly.
Long-term AI helpers remember past chats, but using all memories can trap them in old ideas (Memory Anchoring).
ATLAS is a system that picks the best mix of AI models and helper tools for each question, instead of using just one model or a fixed tool plan.
Real people often ask vague questions with pictures, and today’s vision-language models (VLMs) struggle with them.
ThinkRL-Edit teaches an image editor to think first and draw second, which makes tricky, reasoning-heavy edits much more accurate.
The paper teaches language models using extra 'language homework' made from the same raw text so they learn grammar and meaning, not just next-word guessing.
This paper fixes a common problem in multimodal AI: models can understand pictures and words well but stumble when asked to create matching images.