Groups
Category
Level
Data parallelism splits the training data across workers that compute gradients in parallel on a shared model.