Papers (1061)

Agentic Code Reasoning

The paper teaches AI agents to understand big codebases without running the code by following a strict, step-by-step thinking template called semi-formal reasoning.

#agentic code reasoning #semi-formal reasoning #patch equivalence

FireRed-OCR Technical Report

Intermediate

Hao Wu, Haoran Lou et al. · Mar 2 · arXiv

FireRed-OCR turns a general vision-language model into a careful document reader that follows strict rules, so its outputs are usable in the real world.

#FireRed-OCR #structural hallucination #document parsing

Surgical Post-Training: Cutting Errors, Keeping Knowledge

Intermediate

Wenye Lin, Kai Han · Mar 2 · arXiv

The paper introduces SPOT, a training recipe that fixes an AI model’s mistakes with tiny edits while keeping what it already knows well.

#Surgical Post-Training #SPOT #DPO

Beyond Length Scaling: Synergizing Breadth and Depth for Generative Reward Models

Intermediate

Qiyuan Zhang, Yufei Wang et al. · Mar 2 · arXiv

Longer explanations are not always better; the shape of thinking matters.

#Generative Reward Models #Chain-of-Thought #Breadth-CoT

RubricBench: Aligning Model-Generated Rubrics with Human Standards

Intermediate

Qiyuan Zhang, Junyi Zhou et al. · Mar 2 · arXiv

RubricBench is a new benchmark that checks whether AI judges can use clear, checklist-style rules (rubrics) the way humans do.

#RubricBench #rubric-guided evaluation #reward models

LaSER: Internalizing Explicit Reasoning into Latent Space for Dense Retrieval

Intermediate

Jiajie Jin, Yanzhao Zhang et al. · Mar 2 · arXiv

LaSER teaches a fast search model to “think” quietly inside its hidden space, so it gets the benefits of step-by-step reasoning without writing those steps out as text.

#dense retrieval #chain-of-thought #latent reasoning

When Does RL Help Medical VLMs? Disentangling Vision, SFT, and RL Gains

Intermediate

Ahmadreza Jeddi, Kimia Shaban et al. · Mar 1 · arXiv

This paper asks a simple question: does reinforcement learning (RL) truly make medical vision-language models (VLMs) smarter, or just help them pick better from answers they already know?

#medical vision-language models #reinforcement learning #supervised fine-tuning

AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Large Vision-Language Models

Intermediate

Changwoo Baek, Jouwon Song et al. · Mar 1 · arXiv

Vision-language models process hundreds of image pieces (tokens) per picture, which makes them slow and can lead to made-up details in their answers, known as hallucinations.

#visual token pruning #attention-based pruning #diversity-based pruning

Learn Hard Problems During RL with Reference Guided Fine-tuning

Intermediate

Yangzhen Wu, Shanda Li et al. · Mar 1 · arXiv

ReGFT is a simple pre-RL step that shows the model partial human hints, then makes it solve problems in its own words, creating correct, model-style solutions for hard questions.

#Reference-Guided Fine-Tuning #ReGFT #ReFT

ArtLLM: Generating Articulated Assets via 3D LLM

Intermediate

Penghao Wang, Siyuan Xie et al. · Mar 1 · arXiv

ArtLLM is a 3D large language model that turns a rough 3D shape (from an image, text, or mesh) into a complete, movable 3D object with parts and joints.

#Articulated 3D objects #3D large language model #Point cloud understanding

Unified Vision-Language Modeling via Concept Space Alignment

Intermediate

Yifu Qiu, Paul-Ambroise Duquenne et al. · Mar 1 · arXiv

The paper builds v-Sonar, a bridge that maps images and videos into Sonar, the meaning-space already used for text, so all modalities “speak” the same language.

#v-Sonar #OmniSONAR #concept space alignment

LLaDA-o: An Effective and Length-Adaptive Omni Diffusion Model

Intermediate

Zebin You, Xiaolu Zhang et al. · Mar 1 · arXiv

LLaDA-o is a new AI that understands pictures and text and can also make images, all in one model.

#LLaDA-o #Mixture of Diffusion #masked diffusion models
