Large language models can quietly pick up hidden preferences from training data that looks harmless.
ERNIE 5.0 is a single giant model that can read and create text, images, video, and audio by predicting the next pieces step by step, like writing a story one line at a time.
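The "one line at a time" idea is ordinary autoregressive decoding: at each step the model scores possible next tokens and appends the most likely one. Here is a minimal sketch using a toy bigram table; the vocabulary and probabilities are invented for illustration, and ERNIE 5.0 applies the same loop to text, image, video, and audio tokens rather than words:

```python
# Toy bigram "model": for each current token, a probability table over next tokens.
# The words and probabilities here are made up for illustration only.
probs = {
    "once": {"upon": 0.9, "more": 0.1},
    "upon": {"a": 1.0},
    "a": {"time": 0.8, "hill": 0.2},
}

def generate(start, steps):
    """Greedy autoregressive decoding: repeatedly append the most likely next token."""
    out = [start]
    for _ in range(steps):
        table = probs[out[-1]]
        out.append(max(table, key=table.get))
    return out

print(generate("once", 3))  # ['once', 'upon', 'a', 'time']
```

Real models replace the lookup table with a neural network and often sample instead of always taking the top choice, but the step-by-step structure is the same.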
WideSeek-R1 teaches a small 4B-parameter language model to act like a well-run team: one leader plans, many helpers work in parallel, and everyone learns together with reinforcement learning.
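The leader-plus-parallel-helpers pattern can be sketched with ordinary concurrency primitives. Everything below (the planner, the helper, the merge step) is a hypothetical stand-in: in WideSeek-R1 both roles are played by the 4B model itself, and the division of labor is learned with reinforcement learning rather than hard-coded:

```python
from concurrent.futures import ThreadPoolExecutor

def leader_plan(task):
    # Hypothetical planner: the leader decomposes the task into subtasks.
    return [f"{task}: subtopic {i}" for i in range(1, 4)]

def helper(subtask):
    # Hypothetical worker: each helper handles one subtask independently.
    return f"notes on '{subtask}'"

task = "survey recent wide-search agents"
subtasks = leader_plan(task)
with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
    results = list(pool.map(helper, subtasks))  # helpers run in parallel
report = "\n".join(results)  # the leader merges the helpers' findings
print(len(results))  # 3
```

The point of the sketch is only the shape of the orchestration: one plan fans out into parallel workers whose outputs are gathered back into a single answer.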
The paper finds a strange gap: the model’s internal states almost perfectly signal when it should use a tool, but the text it actually generates often fails to invoke the tool under strict evaluation rules.
Long texts make language models slow because they must keep and re-check a huge memory called the KV cache for every new word they write.
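The slowdown is easy to see in a toy NumPy sketch: the cache stores one key/value pair per generated token, so the attention computed for each new token touches every past entry, and total work grows quadratically with length. Sizes here are tiny and illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy head dimension; real models use far larger sizes

def attend(q, K, V):
    """One query attending over all cached keys/values."""
    scores = K @ q / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

K_cache = np.empty((0, d))
V_cache = np.empty((0, d))
per_step_cost = []
for _ in range(100):  # generate 100 tokens
    q, k, v = rng.normal(size=(3, d))
    K_cache = np.vstack([K_cache, k])  # the cache grows by one row per token
    V_cache = np.vstack([V_cache, v])
    attend(q, K_cache, V_cache)
    per_step_cost.append(len(K_cache))  # attention cost scales with cache size

print(per_step_cost[0], per_step_cost[-1])  # 1 100: each token is pricier than the last
```

Caching avoids recomputing past keys and values, but every new token still has to read the whole cache, which is why long contexts strain both memory and speed.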
EgoActor is a vision-language model that turns everyday instructions like 'Go to the door and say hi' into step-by-step, egocentric actions a humanoid robot can actually do.
The paper teaches multimodal large language models (MLLMs) to stop guessing from just text or just images and instead check both together before answering.
The paper tries several different ways to translate five low-resource Turkic languages, instead of forcing one method to fit all.
Most language models are trained on compressed tokens, which makes training fast but ties the model to a specific tokenizer.
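The lock-in is concrete: two tokenizers split the same text into different token sequences, so embeddings learned for one vocabulary are meaningless under another. A greedy longest-match sketch with two made-up vocabularies shows the mismatch:

```python
def tokenize(text, vocab):
    """Greedy longest-match segmentation against a fixed vocabulary."""
    out, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest piece first
            if text[i:j] in vocab:
                out.append(text[i:j])
                i = j
                break
        else:  # no vocabulary piece matches: fall back to a single character
            out.append(text[i])
            i += 1
    return out

vocab_a = {"un", "happi", "ness"}
vocab_b = {"unhapp", "in", "ess"}
print(tokenize("unhappiness", vocab_a))  # ['un', 'happi', 'ness']
print(tokenize("unhappiness", vocab_b))  # ['unhapp', 'in', 'ess']
```

A model trained on sequences from `vocab_a` has never seen the pieces `vocab_b` produces, which is the coupling the line above describes.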
Agent-Omit teaches AI agents to skip unneeded thinking and old observations, cutting tokens while keeping accuracy high.
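One half of the idea, dropping stale observations from the agent's context, can be sketched directly. The function below is a hand-written illustration; Agent-Omit instead learns when to omit:

```python
def omit_old_observations(messages, keep_last=1):
    """Keep thoughts and actions, but only the `keep_last` newest observations."""
    obs_indices = [i for i, m in enumerate(messages) if m["role"] == "observation"]
    drop = set(obs_indices[:-keep_last]) if keep_last else set(obs_indices)
    return [m for i, m in enumerate(messages) if i not in drop]

# Hypothetical agent history: old tool outputs dominate the token count.
history = [
    {"role": "thought", "text": "search for the file"},
    {"role": "observation", "text": "ls output: 500 lines"},
    {"role": "thought", "text": "open config.yaml"},
    {"role": "observation", "text": "file contents..."},
]
pruned = omit_old_observations(history, keep_last=1)
print(len(pruned))  # 3: the stale 'ls' observation is gone
```

The savings come from the fact that old tool outputs are usually the longest messages in the context but rarely matter once the agent has acted on them.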
The paper tackles a common problem: people can ask AI to do big, complex tasks, but they can’t always explain exactly what they want or check the results well.
Multimodal Process Reward Models (MPRMs) teach AI to judge each step of a picture-and-text reasoning process, not just the final answer.
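Step-level judging differs from outcome-only judging in that it localizes the first faulty step. The checker below is a simple programmatic stand-in for the learned reward model, applied to a made-up arithmetic chain; a real MPRM scores interleaved image-and-text steps with a trained network:

```python
def score_step(step):
    """Stand-in step judge: 1.0 if the stated equation holds, else 0.0."""
    lhs, rhs = step.split("=")
    return 1.0 if eval(lhs) == int(rhs) else 0.0  # eval is safe for this toy input

chain = ["2+3=5", "5*4=20", "20-1=18"]  # the last step contains a slip
step_scores = [score_step(s) for s in chain]
overall = min(step_scores)  # one bad step sinks the whole chain

print(step_scores, overall)  # [1.0, 1.0, 0.0] 0.0
```

An outcome-only judge would just mark the final answer wrong; the per-step scores also say *where* the reasoning broke, which is the signal process reward models provide for training.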