Papers (1262)

Half-Truths Break Similarity-Based Retrieval

Bora Kargi, Arnas Uselis et al. · Feb 27 · arXiv

Similarity-based image–text models like CLIP can be fooled by “half-truths,” where adding one plausible but wrong detail makes a caption look more similar to an image instead of less similar.

#half-truth vulnerability #similarity-based retrieval #CLIP

Ref-Adv: Exploring MLLM Visual Reasoning in Referring Expression Tasks

Intermediate

Qihua Dong, Kuo Yang et al. · Feb 27 · arXiv

This paper builds a new test called Ref-Adv to check if AI can truly match tricky sentences to the right thing in a picture.

#Referring Expression Comprehension #Visual Grounding #Multimodal Large Language Models

LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding

Intermediate

Alexander Samarin, Sergei Krutikov et al. · Feb 27 · arXiv

Speculative decoding speeds up big language models by letting a small helper model guess several next words and having the big model check them all at once.

#speculative decoding #acceptance rate #LK losses

SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale

Intermediate

Ibragim Badertdinov, Maksim Nekrashevich et al. · Feb 27 · arXiv

SWE-rebench V2 is a large-scale, language-agnostic automated pipeline that turns real GitHub pull requests into safe, runnable software tasks for training AI coding agents.

#SWE-rebench V2 #software engineering agents #reinforcement learning

Humans and LLMs Diverge on Probabilistic Inferences

Beginner

Gaurav Kamath, Sreenath Madathil et al. · Feb 26 · arXiv

Humans often make guesses about the world that are likely but not certain, and this paper studies how humans and AI compare at doing that.

#probabilistic reasoning #uncertainty calibration #natural language inference

Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos

Beginner

Ziqi Gao, Jieyu Zhang et al. · Feb 26 · arXiv

This paper builds a giant, automatically made video library called SVG2 that tells who is in a video, what they look like, and how they interact over time.

#video scene graph #spatio-temporal reasoning #panoptic segmentation

CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era

Beginner

Zhengqing Yuan, Kaiwen Shi et al. · Feb 26 · arXiv

The paper tackles a new integrity problem in science: large language models sometimes invent realistic-looking citations that do not exist.

#citation verification #hallucinated citations #scholarly integrity

Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning

Beginner

Chris Samarinas, Haw-Shiuan Chang et al. · Feb 26 · arXiv

SLATE is a new way to teach AI to think step by step while using a search engine, giving feedback at each step instead of only at the end.

#retrieval-augmented reasoning #reinforcement learning #GRPO

MediX-R1: Open Ended Medical Reinforcement Learning

Beginner

Sahal Shaji Mullappilly, Mohammed Irfan Kurpath et al. · Feb 26 · arXiv

MediX-R1 teaches medical AI models to give clear, free-form answers (not just A, B, C, or D) and to explain their thinking.

#medical multimodal RL #open-ended reinforcement learning #composite reward

SeeThrough3D: Occlusion Aware 3D Control in Text-to-Image Generation

Intermediate

Vaibhav Agrawal, Rishubh Parihar et al. · Feb 26 · arXiv

SeeThrough3D teaches image generators to understand what should be visible and what should be hidden when objects overlap, just like in real life.

#occlusion-aware generation #3D layout control #text-to-image

Retrieve and Segment: Are a Few Examples Enough to Bridge the Supervision Gap in Open-Vocabulary Segmentation?

Intermediate

Tilemachos Aravanis, Vladan Stojnić et al. · Feb 26 · arXiv

This paper teaches an AI to segment any object you name (open-vocabulary) much better by adding a few example pictures with pixel labels and smart retrieval.

#open-vocabulary segmentation #vision-language models #retrieval-augmented

AgentDropoutV2: Optimizing Information Flow in Multi-Agent Systems via Test-Time Rectify-or-Reject Pruning

Intermediate

Yutong Wang, Siyuan Xiong et al. · Feb 26 · arXiv

Multi-agent systems are like teams of smart helpers, but one bad message can mislead the whole team.

#multi-agent systems #error propagation #test-time rectification
