Papers (791)

Memory Matters More: Event-Centric Memory as a Logic Map for Agent Searching and Reasoning

Yuyang Hu, Jiongnan Liu et al. · Jan 8 · arXiv

This paper turns an AI agent’s memory from a flat list of notes into a logic map of events connected by causal and temporal links.

#event-centric memory #Event Graph #logic-aware retrieval

Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking

Intermediate

Mingxin Li, Yanzhao Zhang et al. · Jan 8 · arXiv

This paper builds two complementary models, Qwen3-VL-Embedding and Qwen3-VL-Reranker, that represent text, images, visual documents, and videos in one shared space so search works across all of them.

#multimodal retrieval #unified embedding space #cross-encoder reranker

TourPlanner: A Competitive Consensus Framework with Constraint-Gated Reinforcement Learning for Travel Planning

Intermediate

Yinuo Wang, Mining Tan et al. · Jan 8 · arXiv

TourPlanner is a travel-planning system that first gathers the right places, then lets multiple expert ‘voices’ debate plans, and finally polishes the winning plan with a reinforcement learning method that puts satisfying constraints ahead of style.

#travel planning #multi-agent reasoning #chain-of-thought

Aligning Text, Code, and Vision: A Multi-Objective Reinforcement Learning Framework for Text-to-Visualization

Intermediate

Mizanur Rahman, Mohammed Saidul Islam et al. · Jan 8 · arXiv

This paper teaches a model to turn a question about a table into both a short answer and a clear, correct chart.

#Text-to-Visualization #Reinforcement Learning #GRPO

Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing

Intermediate

Yuguang Yue, Irakli Salia et al. · Jan 8 · arXiv

The paper teaches a game-playing AI to copy good human players (behavior cloning) and shows that simply scaling up the model and the data makes the AI reason more causally: it pays attention to what truly causes outcomes on screen.

#behavior cloning #causal reasoning #causal confusion

TCAndon-Router: Adaptive Reasoning Router for Multi-Agent Collaboration

Intermediate

Jiuzhou Zhao, Chunrong Chen et al. · Jan 8 · arXiv

Multi-agent systems are like teams of expert helpers; the tricky part is choosing which helpers to ask for each question.

#multi-agent systems #routing #reasoning chain

Choreographing a World of Dynamic Objects

Intermediate

Yanzhe Lyu, Chen Geng et al. · Jan 7 · arXiv

CHORD is a new way to animate 3D scenes over time (4D) where many objects move and interact, guided only by a text prompt.

#4D generation #Rectified Flow #Score Distillation Sampling

Agentic Rubrics as Contextual Verifiers for SWE Agents

Intermediate

Mohit Raghavendra, Anisha Gunjal et al. · Jan 7 · arXiv

The paper introduces Agentic Rubrics, a new way to check code fixes without running the code by creating a smart checklist from the project itself.

#Agentic Rubrics #Software Engineering Agents #Execution-free Verification

Apollo: Unified Multi-Task Audio-Video Joint Generation

Intermediate

Jun Wang, Chunyu Qiang et al. · Jan 7 · arXiv

APOLLO is a single unified model that generates video and audio jointly or separately while keeping them tightly in sync.

#audio-video generation #multimodal diffusion #single-tower transformer

Benchmark^2: Systematic Evaluation of LLM Benchmarks

Intermediate

Qi Qian, Chengsong Huang et al. · Jan 7 · arXiv

Everyone uses tests (benchmarks) to judge how smart AI models are, but not all tests are good tests.

#LLM evaluation #benchmark quality #ranking consistency

FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection

Intermediate

Mingyu Ouyang, Kevin Qinghong Lin et al. · Jan 7 · arXiv

FOCUSUI makes computer-using AI faster while preserving accuracy by looking only at the important parts of a screen.

#UI grounding #vision-language models #visual token pruning

ROI-Reasoning: Rational Optimization for Inference via Pre-Computation Meta-Cognition

Intermediate

Muyang Zhao, Qi Qi et al. · Jan 7 · arXiv

The paper teaches AI models to plan their thinking time like a smart test-taker who has to finish several questions before the bell rings.

#meta-cognition #budgeted reasoning #token budget
