This paper makes training giant AI models faster and lighter on memory by introducing RaggedShard, a new way to split tensors.