Groups
Mixed precision training stores and computes tensors in low precision (FP16/BF16) for speed and memory savings, while keeping a master copy of the weights in FP32 so that small gradient updates are not lost to rounding.
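A minimal pure-Python sketch of why the FP32 master copy matters. The `fp16` helper (simulating half precision via `struct`'s `'e'` format) and the specific learning rate and gradient values are illustrative assumptions, not any framework's API:

```python
import struct

def fp16(x: float) -> float:
    """Round a Python float to the nearest FP16 value via struct's half-precision format."""
    return struct.unpack('e', struct.pack('e', x))[0]

lr, grad = 1e-2, 1e-2   # per-step update lr * grad = 1e-4 (hypothetical values)
steps = 100

# Naive: weight stored in FP16. Near 1.0, FP16 values are spaced ~9.8e-4 apart,
# so adding 1e-4 rounds straight back to 1.0 and the weight never moves.
w16 = fp16(1.0)
for _ in range(steps):
    w16 = fp16(w16 + fp16(lr * grad))

# Mixed precision: gradients are computed in FP16, but the update is applied
# to an FP32 master copy, which can represent the small increments.
master = 1.0
for _ in range(steps):
    g = fp16(lr * grad)   # low-precision gradient value
    master += g           # accumulated in full precision

print(w16)     # → 1.0 (update underflows in FP16)
print(master)  # → ~1.01 (master copy accumulates all 100 updates)
```

In real frameworks the master copy lives in the optimizer, and loss scaling is added to keep small gradients above the FP16 underflow threshold.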
Matrix factorizations (e.g., LU, QR, Cholesky) rewrite a matrix as a product of simpler building blocks (triangular or orthogonal factors) that make solving and analyzing linear systems much easier.
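A pure-Python sketch of the idea, using Doolittle LU factorization without pivoting on a small hand-picked 2x2 system (the matrix and right-hand side are illustrative; production code would use a pivoted library routine):

```python
def lu(A):
    """Doolittle LU factorization (no pivoting): A = L U, L unit lower-triangular."""
    n = len(A)
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [row[:] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]   # elimination multiplier
            L[i][k] = m
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return L, U

def solve(L, U, b):
    """Solve A x = b as two easy triangular solves: L y = b, then U x = y."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):                       # forward substitution
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):             # back substitution
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

A = [[4.0, 3.0], [6.0, 3.0]]   # system: 4x + 3y = 10, 6x + 3y = 12
b = [10.0, 12.0]
L, U = lu(A)
x = solve(L, U, b)
print(x)  # → [1.0, 2.0]
```

The payoff is that once A is factored, each triangular solve costs O(n^2) instead of O(n^3), so many right-hand sides can be solved cheaply against the same factorization.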