This paper teaches AI models to judge how sure they are about an answer and to think again if they are not sure.
DiffCoT treats a model’s step-by-step thinking (Chain-of-Thought) like a messy draft that can be cleaned up over time, not something fixed forever.