Stochastic Gradient Descent (SGD) updates model parameters using small random subsets (mini-batches) of data, making learning faster and more memory-efficient.
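A minimal sketch of mini-batch SGD, fitting a toy linear model `y = w*x + b` on synthetic data (the data, learning rate, and batch size here are illustrative assumptions, not from the original):

```python
import numpy as np

# Toy SGD sketch: fit y ≈ w*x + b on synthetic, noise-free data
# using small random mini-batches instead of the full dataset.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 2 * x + 1                              # true relationship: w=2, b=1

w, b = 0.0, 0.0
lr, batch = 0.1, 16
for _ in range(500):
    idx = rng.integers(0, len(x), batch)   # sample a random mini-batch
    xb, yb = x[idx], y[idx]
    err = (w * xb + b) - yb                # prediction error on the batch
    w -= lr * 2 * np.mean(err * xb)        # MSE gradient w.r.t. w (batch estimate)
    b -= lr * 2 * np.mean(err)             # MSE gradient w.r.t. b (batch estimate)
```

Each update touches only 16 of the 200 points, yet the batch gradient is an unbiased estimate of the full gradient, so `w` and `b` still converge near 2 and 1.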
Gradient descent is a simple, iterative way to move downhill on a loss surface: at each step, it moves the parameters a small amount in the direction opposite the gradient, which is the direction of steepest descent.
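The update rule above can be sketched on a one-variable toy problem (the function, learning rate, and step count are illustrative choices, not part of the original):

```python
# Minimal gradient descent sketch: minimize f(x) = (x - 3)^2,
# whose gradient is f'(x) = 2 * (x - 3). The minimum is at x = 3.
def gradient_descent(lr=0.1, steps=100):
    x = 0.0                  # arbitrary starting point
    for _ in range(steps):
        grad = 2 * (x - 3)   # gradient of the loss at the current x
        x -= lr * grad       # step in the opposite direction of the gradient
    return x

x_min = gradient_descent()   # converges near the minimum at x = 3
```

Each step shrinks the distance to the minimum by a constant factor (here `1 - 2*lr = 0.8`), so after 100 steps the iterate is essentially at `x = 3`.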