LLMs can sound confident yet still change their answers when the surrounding text nudges them, which shows that confidence alone is not a reliable signal of truthfulness.
This paper introduces a method called RISE for finding and steering these internal behaviors in AI models, without requiring any human-made labels.