Big idea: Make image-making AIs stop, think, check, and fix their own work so they get better at both creating pictures and understanding them.
Computer-using agents kept forgetting important visual details over long tasks and could not reliably find up-to-date, step-by-step help for unfamiliar apps.
LLMs can look confident but still change their answers when the surrounding text nudges them, showing that confidence alone isn't real truthfulness.
This paper shows a new way (called RISE) to find and control how AI models think without needing any human-made labels.