Papers (129)

Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

The paper teaches an AI to act like a careful traveler: it looks at a photo, forms guesses about where it might be, and uses real map tools to check each guess.

#image geolocalization #map-augmented agent #Thinking with Map

Same Claim, Different Judgment: Benchmarking Scenario-Induced Bias in Multilingual Financial Misinformation Detection

Beginner

Zhiwei Liu, Yupen Cao et al. · Jan 8 · arXiv

This paper builds MFMD-Scen, a large benchmark that tests how AI changes its true/false judgments about the same money-related claim when the situation around it changes.

#financial misinformation detection #scenario-induced bias #multilingual benchmark

RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes

Beginner

Yuan-Kang Lee, Kuan-Lin Chen et al. · Jan 8 · arXiv

This paper teaches a camera to fix nighttime colors by combining a smart rule-based color trick (SGP-LRD) with a learning-by-trying helper (reinforcement learning).

#auto white balance #color constancy #nighttime imaging

Re-Align: Structured Reasoning-guided Alignment for In-Context Image Generation and Editing

Beginner

Runze He, Yiji Cheng et al. · Jan 8 · arXiv

Re-Align is a new way for AI to make and edit pictures by thinking in clear steps before drawing.

#In-Context Image Generation #Reference-based Image Editing #Structured Reasoning

Agent-as-a-Judge

Beginner

Runyang You, Hongru Cai et al. · Jan 8 · arXiv

This survey explains how AI judges are changing from single smart readers (LLM-as-a-Judge) into full-on agents that can plan, use tools, remember, and work in teams (Agent-as-a-Judge).

#Agent-as-a-Judge #LLM-as-a-Judge #multi-agent collaboration

GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts

Beginner

Wenhao Zeng, Xuteng Zhang et al. · Jan 8 · arXiv

Big reasoning AIs think in many steps, which is slow and costly.

#collaborative inference #initial token entropy #step-level routing

Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction

Beginner

Muzhao Tian, Zisu Huang et al. · Jan 8 · arXiv

Long-term AI helpers remember past chats, but using all memories can trap them in old ideas (Memory Anchoring).

#steerable memory #memory anchoring #long-term agents

Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning

Beginner

Jinyang Wu, Guocheng Zhai et al. · Jan 7 · arXiv

ATLAS is a system that picks the best mix of AI models and helper tools for each question, instead of using just one model or a fixed tool plan.

#ATLAS #LLM routing #tool augmentation

What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models

Beginner

Dasol Choi, Guijin Son et al. · Jan 7 · arXiv

Real people often ask vague questions with pictures, and today’s vision-language models (VLMs) struggle with them.

#vision-language models #under-specified queries #query explicitation

ThinkRL-Edit: Thinking in Reinforcement Learning for Reasoning-Centric Image Editing

Beginner

Hengjia Li, Liming Jiang et al. · Jan 6 · arXiv

ThinkRL-Edit teaches an image editor to think first and draw second, which makes tricky, reasoning-heavy edits much more accurate.

#reasoning-centric image editing #reinforcement learning #chain-of-thought

Enhancing Linguistic Competence of Language Models through Pre-training with Language Learning Tasks

Beginner

Atsuki Yamaguchi, Maggie Mi et al. · Jan 6 · arXiv

The paper teaches language models using extra 'language homework' made from the same raw text so they learn grammar and meaning, not just next-word guessing.

#language model pretraining #causal language modeling #linguistic competence

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Beginner

Ruiyan Han, Zhen Fang et al. · Jan 6 · arXiv

This paper fixes a common problem in multimodal AI: models can understand pictures and words well but stumble when asked to create matching images.

#Unified Multimodal Models #Self-Generated Supervision #Conduction Aphasia
