The paper shows a fast, training-free way to boost an LLM’s step-by-step reasoning by reusing the model’s own probabilities.
This paper teaches a model to be its own teacher so it can climb out of a learning plateau on very hard math problems.