Concepts (2)

Mixed Precision Training

Mixed precision training stores tensors and runs most computation in 16-bit floating point (FP16 or BF16) for speed and memory savings, while keeping an FP32 master copy of the weights so that small updates are not lost to rounding.
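A minimal sketch of why the master copy matters, using only the Python standard library. The helper names (`to_fp16`, `train`) and the constant gradient are hypothetical and chosen for illustration. Python floats are FP64 and stand in for the FP32 master weights here, and `struct`'s `"e"` format is used to simulate FP16 rounding.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def train(steps: int, lr: float, grad: float, use_master: bool) -> float:
    """Run `steps` SGD updates with a constant gradient; return the FP16 weight."""
    master = 1.0           # high-precision master weight (FP64 standing in for FP32)
    w16 = to_fp16(master)  # low-precision working copy used for compute
    for _ in range(steps):
        g16 = to_fp16(grad)  # gradient as produced by low-precision compute
        if use_master:
            # Accumulate the update in high precision, then re-cast for compute.
            master -= lr * g16
            w16 = to_fp16(master)
        else:
            # Accumulate directly in FP16: each tiny update is rounded away.
            w16 = to_fp16(w16 - lr * g16)
    return w16


fp16_only = train(steps=1000, lr=0.01, grad=0.01, use_master=False)
with_master = train(steps=1000, lr=0.01, grad=0.01, use_master=True)
print(fp16_only, with_master)
```

Each update here is about 1e-4, which is smaller than half the FP16 spacing just below 1.0, so the FP16-only weight never moves, while the master-copy version ends up close to 0.9.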

#mixed precision #fp16 #bf16

⚙️ Algorithm · Intermediate

Gradient Clipping & Normalization

Gradient clipping caps either individual gradient values (clipping by value) or the overall gradient norm (clipping by norm) before the optimizer step, which prevents exploding updates.
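The two variants can be sketched in plain Python as follows. The function names (`clip_by_value`, `clip_by_global_norm`) and the flat-list representation of gradients are assumptions made for illustration; deep learning frameworks provide their own equivalents that operate on tensors.

```python
import math


def clip_by_value(grads: list[float], limit: float) -> list[float]:
    """Clamp every gradient component into the range [-limit, limit]."""
    return [max(-limit, min(limit, g)) for g in grads]


def clip_by_global_norm(grads: list[float], max_norm: float) -> tuple[list[float], float]:
    """Rescale the whole gradient vector so its L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads), norm  # already within the limit; leave unchanged
    scale = max_norm / norm
    return [g * scale for g in grads], norm


grads = [3.0, -4.0]
by_value = clip_by_value(grads, limit=1.0)
by_norm, original_norm = clip_by_global_norm(grads, max_norm=1.0)
print(by_value, by_norm, original_norm)
```

Clipping by value can change the direction of the gradient, because each component is clamped independently. Clipping by norm preserves the direction and only shrinks the length.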

#gradient clipping