Groups
Gradient clipping limits how large gradient values or their overall magnitude can become during optimization to prevent exploding updates.
Learning rate schedules control how fast a model learns over time by changing the learning rate across iterations or epochs.