Data parallelism splits the training data across workers, each holding a replica of a shared model; the workers compute gradients on their shards in parallel, and the gradients are averaged before one shared update step.
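A minimal pure-Python sketch of that idea (the toy linear model, learning rate, and four-worker split are illustrative assumptions, not from the source): each "worker" computes the gradient on its own shard under the same weight, the gradients are averaged as an all-reduce would do, and the shared weight takes one step.

```python
import random

random.seed(0)
TRUE_W = 3.0
# Noiseless toy data for the scalar model y = w * x.
data = [(x, TRUE_W * x) for x in [random.uniform(-1, 1) for _ in range(40)]]

# Split the data across 4 workers (disjoint shards).
shards = [data[i::4] for i in range(4)]

def shard_gradient(w, shard):
    # Gradient of the mean squared error 0.5*(w*x - y)^2 over one shard.
    return sum((w * x - y) * x for x, y in shard) / len(shard)

w = 0.0
for step in range(100):
    grads = [shard_gradient(w, s) for s in shards]  # parallel in practice
    g = sum(grads) / len(grads)                     # average (all-reduce)
    w -= 0.5 * g                                    # shared model update
```

On this noiseless problem the averaged-gradient updates drive `w` to the true value 3.0; in a real system only the averaging step needs communication, which is why the scheme scales with the number of workers.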
ADMM (the alternating direction method of multipliers) splits a hard optimization problem into two easier subproblems that are solved alternately and coupled through a dual variable via simple averaging-like steps.
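A minimal sketch of ADMM on a toy lasso-like problem (the problem instance, `rho`, and `lam` are assumptions for illustration): minimize 0.5 * sum_i (x - a_i)^2 + lam * |z| subject to x = z. The hard problem splits into a closed-form least-squares x-update and a soft-thresholding z-update, which communicate only through z and the scaled dual u.

```python
a = [1.0, 2.0, 3.0, 4.0]   # data points (assumed toy values)
lam, rho = 1.0, 1.0        # regularization weight and ADMM penalty
n = len(a)

def soft_threshold(v, t):
    # Proximal operator of t*|.|: shrink v toward zero by t.
    return max(v - t, 0.0) if v > 0 else min(v + t, 0.0)

x = z = u = 0.0
for _ in range(200):
    # x-update: easy least-squares subproblem, closed form.
    x = (sum(a) + rho * (z - u)) / (n + rho)
    # z-update: easy proximal subproblem (soft-thresholding).
    z = soft_threshold(x + u, lam / rho)
    # dual update: accumulate the running disagreement x - z.
    u += x - z
```

At convergence x and z agree on the minimizer of the combined objective, here (sum(a) - lam) / n = 2.25; neither subproblem ever has to handle both the smooth term and the nonsmooth term at once, which is the point of the splitting.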