Adam is an optimization algorithm that combines momentum (an exponential moving average of the gradient, the first moment) with RMSProp-style adaptive learning rates (an exponential moving average of the squared gradient, the second moment).
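A minimal sketch of one Adam update step under these definitions (hyperparameter defaults follow the common convention; the function and variable names are illustrative):

```python
def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter theta at step t (t starts at 1)."""
    # First moment: exponentially weighted average of gradients (momentum).
    m = beta1 * m + (1 - beta1) * grad
    # Second moment: exponentially weighted average of squared gradients (RMSProp).
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction compensates for m and v being initialized to zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Step size is adapted per parameter by the second-moment estimate.
    theta = theta - lr * m_hat / (v_hat ** 0.5 + eps)
    return theta, m, v
```

Because the update divides the first moment by the root of the second, the effective step size is roughly `lr` regardless of the raw gradient scale.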
Momentum methods add an exponentially weighted memory of past gradients to make descent steps smoother and faster, especially in ravines and ill-conditioned problems.
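A sketch of the classic momentum (heavy-ball) update, illustrated on a hypothetical ill-conditioned quadratic; names and constants are for illustration:

```python
import numpy as np

def momentum_step(theta, grad, velocity, lr=0.01, mu=0.9):
    """One SGD-with-momentum update; mu controls the memory of past gradients."""
    # The velocity accumulates an exponentially weighted sum of past gradients:
    # consistent directions reinforce, oscillating ones partially cancel.
    velocity = mu * velocity - lr * grad
    return theta + velocity, velocity
```

On a ravine-shaped loss such as `f(x, y) = 50 * x**2 + 0.5 * y**2`, the velocity damps the back-and-forth oscillation along the steep `x` axis while accelerating steady progress along the shallow `y` axis.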