Papers (181)

#GRPO

MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning

Jiawei Chen, Xintian Shen et al., Dec 29, arXiv

MindWatcher is a smart AI agent that can think step by step and decide when to use tools like web search, image zooming, and a code calculator to solve tough, multi-step problems.

#Tool-Integrated Reasoning #Interleaved Thinking #Multimodal Chain-of-Thought


Evaluating Parameter Efficient Methods for RLVR

Intermediate

Qingyu Yin, Yulun Wu et al., Dec 29, arXiv

The paper asks which small, add-on training tricks (PEFT) work best when we teach language models with yes/no rewards we can check (RLVR).

#RLVR #parameter-efficient fine-tuning #LoRA


See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning

Intermediate

Shuoshuo Zhang, Yizhen Zhang et al., Dec 26, arXiv

The paper teaches vision-language models (AIs that look and read) to pay attention to the right parts of a picture without needing extra tools when answering.

#BiPS #perceptual shaping #vision-language models


MAI-UI Technical Report: Real-World Centric Foundation GUI Agents

Intermediate

Hanzhang Zhou, Xu Zhang et al., Dec 26, arXiv

MAI-UI is a family of AI agents that can see, understand, and control phone and computer screens using plain language.

#GUI agent #GUI grounding #mobile navigation


SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents

Intermediate

Shaofei Cai, Yulei Qin et al., Dec 26, arXiv

SmartSnap teaches an agent not only to finish a phone task but also to prove it with a few perfect snapshots it picks itself.

#Self-verifying agents #Evidence curation #3C principles


UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture

Intermediate

Shuo Cao, Jiayang Li et al., Dec 25, arXiv

This paper teaches AI to notice not just what is in a picture, but how the picture looks and feels to people.

#perceptual image understanding #image aesthetics assessment (IAA) #image quality assessment (IQA)


NVIDIA Nemotron 3: Efficient and Open Intelligence

Intermediate

NVIDIA et al., Dec 24, arXiv

Nemotron 3 is a new family of open AI models (Nano, Super, Ultra) built to think better while running faster and cheaper.

#Nemotron 3 #Mixture-of-Experts #LatentMoE


LongVideoAgent: Multi-Agent Reasoning with Long Videos

Intermediate

Runtao Liu, Ziyi Liu et al., Dec 23, arXiv

LongVideoAgent is a team of three AIs that work together to answer questions about hour‑long TV episodes without missing small details.

#long video question answering #multi-agent reasoning #temporal grounding


DiRL: An Efficient Post-Training Framework for Diffusion Language Models

Intermediate

Ying Zhu, Jiaxin Wan et al., Dec 23, arXiv

This paper builds DiRL, a fast and careful way to finish training diffusion language models so they reason better.

#Diffusion Language Model #Blockwise dLLM #Post-Training


Multi-hop Reasoning via Early Knowledge Alignment

Intermediate

Yuxin Wang, Shicheng Fang et al., Dec 23, arXiv

This paper adds a tiny but powerful step called Early Knowledge Alignment (EKA) to multi-step retrieval systems so the model takes a quick, smart look at relevant information before it starts planning.

#Retrieval-Augmented Generation #Iterative RAG #Multi-hop Reasoning


Memory-T1: Reinforcement Learning for Temporal Reasoning in Multi-session Agents

Intermediate

Yiming Du, Baojun Wang et al., Dec 23, arXiv

Memory-T1 teaches chatty AI agents to keep track of when things happened across many conversations.

#temporal reasoning #multi-session dialogue #reinforcement learning


GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators

Intermediate

Jiacheng Guo, Ling Yang et al., Dec 22, arXiv

GenEnv is a training system where a student AI and a teacher simulator grow together by exchanging tasks and feedback.

#GenEnv #co-evolutionary learning #difficulty-aligned curriculum

