Papers (13)

#Pareto frontier

KARL: Knowledge Agents via Reinforcement Learning

Jonathan D. Chang, Andrew Drozdov et al. · Mar 5 · arXiv

KARL is a smart search helper that learns to look up information step by step and explain answers using the facts it finds.

#grounded reasoning #enterprise search #reinforcement learning


Phi-4-reasoning-vision-15B Technical Report

Intermediate

Jyoti Aneja, Michael Harrison et al. · Mar 4 · arXiv

Phi-4-reasoning-vision-15B is a small, open-weight AI that understands pictures and text together and is especially good at math, science, and using computer screens.

#multimodal reasoning #vision-language model #mid-fusion


SkillOrchestra: Learning to Route Agents via Skill Transfer

Beginner

Jiayu Wang, Yifei Ming et al. · Feb 23 · arXiv

SkillOrchestra is a new way to make teams of AI models and tools work together by thinking in terms of skills, not just picking one big model for everything.

#agent orchestration #model routing #skill discovery


Decoding ML Decision: An Agentic Reasoning Framework for Large-Scale Ranking System

Intermediate

Longfei Yun, Yihan Wu et al. · Feb 20 · arXiv

GEARS is a new way to improve big ranking systems (like what shows up first in your feed) by letting an AI agent explore options safely, instead of humans tweaking knobs by hand.

#GEARS #agentic ranking #Specialized Agent Skills


Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing

Intermediate

Tong Zheng, Chengsong Huang et al. · Feb 3 · arXiv

Parallel-Probe is a simple add-on that lets many AI “thought paths” think at once but stop early when they already agree.

#parallel thinking #2D probing #consensus-based early stopping


Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization

Intermediate

Haocheng Xi, Shuo Yang et al. · Feb 3 · arXiv

Auto-regressive video models make videos one chunk at a time but run out of GPU memory because the KV-cache grows with history.

#Quant VideoGen (QVG) #KV-cache quantization #2-bit quantization


Balancing Understanding and Generation in Discrete Diffusion Models

Intermediate

Yue Liu, Yuzhong Zhao et al. · Feb 1 · arXiv

This paper introduces XDLM, a single model that blends two popular diffusion styles (masked and uniform) so it both understands and generates text and images well.

#XDLM #discrete diffusion #stationary noise kernel


ECO: Quantized Training without Full-Precision Master Weights

Intermediate

Mahdi Nikdan, Amir Zandieh et al. · Jan 29 · arXiv

Training big AI models uses lots of memory because most methods still keep a secret full-precision copy of the weights called master weights.

#quantized training #master weights #error feedback


Toward Efficient Agents: Memory, Tool learning, and Planning

Intermediate

Xiaofang Yang, Lijun Li et al. · Jan 20 · arXiv

This survey explains how to make AI agents not just smart, but also efficient with their time, memory, and tool use.

#agent efficiency #memory compression #tool learning


Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

Intermediate

Mike A. Merrill, Alexander G. Shaw et al. · Jan 17 · arXiv

Terminal-Bench 2.0 is a tough test that checks how well AI agents can solve real, professional tasks by typing commands in a computer terminal.

#Terminal-Bench #command line interface #Docker containers


GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts

Beginner

Wenhao Zeng, Xuteng Zhang et al. · Jan 8 · arXiv

Big reasoning AIs think in many steps, which is slow and costly.

#collaborative inference #initial token entropy #step-level routing


ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models

Intermediate

Long Lian, Sida Wang et al. · Nov 24 · arXiv

ThreadWeaver teaches a language model to split big problems into smaller parts it can solve at the same time, like teammates working in parallel.

#adaptive parallel reasoning #fork–join #threaded inference

