Groups
The knowledge distillation loss blends standard cross-entropy on hard labels with a KL-divergence match to the teacher's softened output distribution; a temperature parameter controls how much both distributions are smoothed before comparison.
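A minimal PyTorch sketch of this loss, following the common Hinton-style formulation. The function name `distillation_loss` and the weights `T` and `alpha` are illustrative choices, not names from the original text:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Hard-label term: ordinary cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # Soft term: KL divergence between temperature-softened distributions.
    # F.kl_div expects log-probabilities for the input and probabilities
    # for the target.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # scale by T^2 so gradients stay comparable across temperatures
    return alpha * hard + (1 - alpha) * soft
```

Higher temperatures flatten the teacher's distribution, exposing the relative probabilities it assigns to wrong classes, which is the extra signal distillation exploits.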
Contrastive learning trains a model by pulling the representations of similar examples (positives) together and pushing those of dissimilar examples (negatives) apart in an embedding space.
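A short sketch of one common contrastive objective, the InfoNCE loss over paired batches, assuming each row of `anchor` and `positive` is an embedding of the same underlying example; the function name and `temperature` default are illustrative:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchor, positive, temperature=0.1):
    # anchor, positive: (N, D) embeddings; row i of each forms a positive pair.
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    # (N, N) cosine-similarity matrix scaled by temperature.
    logits = a @ p.t() / temperature
    # Diagonal entries are the positives; every other column in a row
    # acts as an in-batch negative.
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)
```

Treating the similarity matrix as classification logits makes cross-entropy do both jobs at once: maximizing the diagonal pulls positives together, while the softmax denominator pushes the off-diagonal negatives apart.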