Gradient clipping limits either individual gradient values (clip-by-value) or the gradients' overall norm (clip-by-norm) during optimization, preventing exploding updates.
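A minimal sketch of clip-by-norm, assuming gradients arrive as a list of NumPy arrays (the function name and epsilon are illustrative, not from any particular library):

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    # Global L2 norm across all gradient arrays.
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    # Uniform downscaling only when the norm exceeds the threshold;
    # the small epsilon guards against division by zero.
    scale = min(1.0, max_norm / (total_norm + 1e-6))
    return [g * scale for g in grads]

grads = [np.array([3.0, 4.0])]          # global norm = 5
clipped = clip_by_global_norm(grads, max_norm=1.0)
# clipped gradients now have norm ~1.0, direction unchanged
```

Clipping by norm preserves the gradient's direction; clip-by-value (e.g. `np.clip(g, -c, c)`) does not, which is why norm clipping is the more common default.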
Adam is an optimization algorithm that combines momentum (first moment) with RMSProp-style adaptive learning rates (second moment).
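The two moments can be sketched as a single Adam update step in NumPy (hyperparameter defaults follow the original paper; the function signature itself is illustrative):

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # First moment: exponential moving average of gradients (momentum term).
    m = beta1 * m + (1 - beta1) * grad
    # Second moment: EMA of squared gradients (RMSProp-style scaling).
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction compensates for m and v starting at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Per-parameter step size: large second moment -> smaller effective step.
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

p, m, v = np.zeros(2), np.zeros(2), np.zeros(2)
p, m, v = adam_step(p, np.array([1.0, -1.0]), m, v, t=1)
```

On the first step the bias-corrected moments make the update roughly `lr` in magnitude regardless of the gradient's scale, which is what gives Adam its per-parameter adaptivity.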