Papers (10)

#Pareto frontier

Phi-4-reasoning-vision-15B Technical Report

Jyoti Aneja, Michael Harrison et al. · Mar 4 · arXiv

Phi-4-reasoning-vision-15B is a small, open-weight AI that understands pictures and text together and is especially good at math, science, and using computer screens.

#multimodal reasoning #vision-language model #mid-fusion

Decoding ML Decision: An Agentic Reasoning Framework for Large-Scale Ranking System

Intermediate

Longfei Yun, Yihan Wu et al. · Feb 20 · arXiv

GEARS is a new way to improve big ranking systems (like what shows up first in your feed) by letting an AI agent explore options safely, instead of humans tweaking knobs by hand.

#GEARS #agentic ranking #Specialized Agent Skills

Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing

Intermediate

Tong Zheng, Chengsong Huang et al. · Feb 3 · arXiv

Parallel-Probe is a simple add-on that lets many AI “thought paths” run at once and stops them early once they already agree.

#parallel thinking #2D probing #consensus-based early stopping

Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization

Intermediate

Haocheng Xi, Shuo Yang et al. · Feb 3 · arXiv

Auto-regressive video models make videos one chunk at a time but run out of GPU memory because the KV-cache grows with history.

#Quant VideoGen (QVG) #KV-cache quantization #2-bit quantization

Balancing Understanding and Generation in Discrete Diffusion Models

Intermediate

Yue Liu, Yuzhong Zhao et al. · Feb 1 · arXiv

This paper introduces XDLM, a single model that blends two popular diffusion styles (masked and uniform) so it both understands and generates text and images well.

#XDLM #discrete diffusion #stationary noise kernel

ECO: Quantized Training without Full-Precision Master Weights

Intermediate

Mahdi Nikdan, Amir Zandieh et al. · Jan 29 · arXiv

Training big AI models uses lots of memory because most methods still keep a separate full-precision copy of the weights, called master weights.

#quantized training #master weights #error feedback

Toward Efficient Agents: Memory, Tool learning, and Planning

Intermediate

Xiaofang Yang, Lijun Li et al. · Jan 20 · arXiv

This survey explains how to make AI agents not just smart, but also efficient with their time, memory, and tool use.

#agent efficiency #memory compression #tool learning

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

Intermediate

Mike A. Merrill, Alexander G. Shaw et al. · Jan 17 · arXiv

Terminal-Bench 2.0 is a tough test that checks how well AI agents can solve real, professional tasks by typing commands in a computer terminal.

#Terminal-Bench #command line interface #Docker containers

ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models

Intermediate

Long Lian, Sida Wang et al. · Nov 24 · arXiv

ThreadWeaver teaches a language model to split big problems into smaller parts it can solve at the same time, like teammates working in parallel.

#adaptive parallel reasoning #fork–join #threaded inference

GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning

Intermediate

Lakshya A Agrawal, Shangyin Tan et al. · Jul 25 · arXiv

GEPA is a new way to improve AI prompts by letting the AI read its own work, reflect in plain language on what went wrong, and then rewrite its instructions.

#GEPA #reflective prompt evolution #Pareto frontier