The knowledge distillation loss blends standard cross-entropy on hard labels with a soft match to the teacher's output distribution, where both student and teacher logits are softened by a temperature parameter before comparison.
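A minimal sketch of this loss, assuming PyTorch; the function name `distillation_loss` and the weighting parameter `alpha` are illustrative choices, not a fixed API:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a temperature-softened
    KL match to the teacher's output distribution."""
    # Hard term: standard cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # Soft term: KL divergence between temperature-scaled distributions.
    # The T*T factor keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * hard + (1.0 - alpha) * soft
```

A higher temperature flattens both distributions, exposing the teacher's relative preferences among incorrect classes, which is the signal the soft term transfers to the student.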
Kullback–Leibler (KL) divergence measures how one probability distribution P allocates probability mass differently from a reference distribution Q; for discrete distributions it is D_KL(P ‖ Q) = Σ_x P(x) log(P(x) / Q(x)).
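A small numeric sketch of the discrete formula, assuming NumPy; the `eps` smoothing term is an illustrative guard against log(0), not part of the definition:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence: D_KL(P || Q) = sum_x P(x) * log(P(x) / Q(x))."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Identical distributions give 0; divergence grows as P places mass
# on outcomes that Q considers unlikely.
print(kl_divergence([0.5, 0.5], [0.5, 0.5]))  # 0.0
print(kl_divergence([0.9, 0.1], [0.5, 0.5]))  # ~0.368
```

Note that KL divergence is asymmetric: D_KL(P ‖ Q) generally differs from D_KL(Q ‖ P), so it is not a true distance metric.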