Groups
Sharpness-Aware Minimization (SAM) trains models to perform well even when their weights are slightly perturbed, seeking flatter minima that generalize better.
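A minimal sketch of one SAM update, assuming a hypothetical toy loss and hand-written gradient: ascend within a small radius `rho` to the worst-case nearby weights, then descend using the gradient computed there.

```python
import numpy as np

def loss(w):
    # Hypothetical toy loss with some non-convex wiggle added to a bowl
    return np.sum(w**2) + 0.5 * np.sum(np.sin(5 * w))

def grad(w):
    # Analytic gradient of the toy loss above
    return 2 * w + 2.5 * np.cos(5 * w)

def sam_step(w, lr=0.1, rho=0.05):
    g = grad(w)
    # Ascent direction: approximate worst-case perturbation within radius rho
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    g_adv = grad(w + eps)   # gradient evaluated at the perturbed weights
    return w - lr * g_adv   # descend using the perturbed gradient

w = np.array([1.0, -0.8])
initial = loss(w)
for _ in range(100):
    w = sam_step(w)
```

Compared with plain gradient descent, the only extra cost is one additional gradient evaluation per step (at `w + eps`); the perturbation radius `rho` controls how large a neighborhood the minimum must stay flat over.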
Newton's method uses both the gradient and the Hessian to take steps that aim directly at the local optimum by fitting a quadratic model of the loss around the current point.
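A small sketch of why the quadratic model "aims directly at the optimum": on an exactly quadratic loss (a hypothetical positive-definite example below), a single Newton step lands on the minimizer. The step solves the linear system H·Δ = g rather than forming the inverse Hessian explicitly.

```python
import numpy as np

# Hypothetical quadratic loss f(w) = 0.5 * w^T A w - b^T w
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])   # positive-definite Hessian
b = np.array([1.0, -1.0])

def f(w):
    return 0.5 * w @ A @ w - b @ w

w = np.zeros(2)
g = A @ w - b                 # gradient at the current point
step = np.linalg.solve(A, g)  # solve H * step = g (avoids inverting H)
w = w - step                  # one Newton step reaches the exact minimum
```

For non-quadratic losses the quadratic model is only local, so Newton's method iterates this update; it also requires the Hessian to be positive definite (or modified to be) for the step to be a descent direction.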