Small AI models often stumble when a tool call fails, getting stuck repeating the same bad call instead of fixing the mistake.
Diffusion language models can write tokens in any order, but that freedom can accidentally hurt their ability to reason well.
Robots used to explore by following simple rules or short-term rewards, which often made them waste time and backtrack a lot.
Think3D lets AI models stop guessing from flat pictures and start exploring real 3D space, like walking around a room in a video game.
This paper is the first big map of how AI can fix real software problems, not just write short code snippets.
STEP3-VL-10B is a small (10 billion parameters) open multimodal model that sees images and reads text, yet scores like much larger models.
This paper studies how tool-using AI agents express how sure they are, and finds a split: some tools make them overconfident, while others help them stay honest.
ArenaRL teaches AI agents by comparing their answers against each other, like a sports tournament, instead of giving each answer a single noisy score.
The paper teaches an AI to act like a careful traveler: it looks at a photo, forms guesses about where it might be, and uses real map tools to check each guess.
This paper teaches a camera to fix nighttime colors by combining a smart rule-based color trick (SGP-LRD) with a learning-by-trying helper (reinforcement learning).
This survey explains how AI judges are changing from single smart readers (LLM-as-a-Judge) into full-on agents that can plan, use tools, remember, and work in teams (Agent-as-a-Judge).
Long-term AI helpers remember past chats, but relying on all of those memories can trap them in old ideas (Memory Anchoring).