Groups
Knowledge distillation loss blends standard hard-label cross-entropy with a soft-target term that matches the student's output distribution to the teacher's; both distributions are softened by a temperature parameter before comparison.
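A minimal PyTorch sketch of this blend, assuming the common formulation from Hinton et al. (2015); the function name `distillation_loss` and the defaults `T=4.0` and `alpha=0.5` are illustrative choices, not fixed by the text:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.5):
    # Hard-label term: ordinary cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, targets)
    # Soft term: KL divergence between the temperature-softened student and
    # teacher distributions; the T^2 factor keeps gradient magnitudes
    # comparable across temperatures (a standard convention).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # alpha trades off soft-target matching against hard-label fitting.
    return alpha * soft + (1.0 - alpha) * hard
```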
Focal loss reshapes cross-entropy so that hard, misclassified examples receive more weight while easy, well-classified ones are down-weighted.
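One way this down-weighting is commonly written is \(\mathrm{FL}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log p_t\), as in Lin et al. (2017). Below is a minimal sketch in that spirit; the scalar `alpha` (rather than a per-class weight) and the defaults `gamma=2.0`, `alpha=0.25` are simplifying assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    # Per-example cross-entropy, i.e. -log p_t for the true class.
    ce = F.cross_entropy(logits, targets, reduction="none")
    # Recover p_t, the predicted probability of the true class.
    p_t = torch.exp(-ce)
    # (1 - p_t)^gamma shrinks the loss on easy examples (p_t near 1)
    # while leaving hard, misclassified ones (p_t near 0) nearly untouched.
    return (alpha * (1.0 - p_t) ** gamma * ce).mean()
```

With `gamma=0` and `alpha=1` this reduces to plain cross-entropy, which makes the reshaping easy to sanity-check in practice.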