Knowledge distillation loss blends standard hard-label cross-entropy with a soft term that matches the student's output distribution to the teacher's, with a temperature parameter softening both distributions before the match.
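A minimal sketch of this loss, assuming PyTorch (the document names no framework); the function name, the temperature T=2.0, and the mixing weight alpha=0.5 are illustrative choices, not canonical values:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=2.0, alpha=0.5):
    # Hard-label term: standard cross-entropy against ground-truth classes.
    hard = F.cross_entropy(student_logits, targets)
    # Soft term: KL divergence between temperature-softened distributions.
    # kl_div expects log-probabilities for the input and probabilities
    # for the target, hence log_softmax vs. softmax.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    )
    # T**2 restores the gradient scale of the softened term
    # (Hinton et al., 2015); alpha weights hard vs. soft losses.
    return alpha * hard + (1.0 - alpha) * (T ** 2) * soft

# Toy usage: batch of 8 examples, 10 classes.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, targets)
loss.backward()
```

A common design choice is to anneal alpha or T over training; the values above are only starting points.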
CTC (Connectionist Temporal Classification) loss trains sequence models when the alignment between inputs (frames) and outputs (labels) is unknown: it introduces a blank symbol and marginalizes over every alignment consistent with the target sequence.
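A minimal sketch using PyTorch's nn.CTCLoss, continuing the framework assumption above; all shapes and sizes (50 frames, batch of 4, 20-symbol vocabulary) are made up for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical sizes: 50 input frames, batch of 4, 20-class vocabulary
# (index 0 reserved for the CTC blank), targets up to 12 labels long.
T_frames, batch, n_classes, max_target_len = 50, 4, 20, 12

ctc = nn.CTCLoss(blank=0)  # blank index must match the model's output layer

# nn.CTCLoss expects log-probabilities of shape (T, N, C).
log_probs = torch.randn(
    T_frames, batch, n_classes, requires_grad=True
).log_softmax(dim=-1)

# Targets exclude the blank, so labels are drawn from 1..n_classes-1;
# per-sequence lengths tell CTC where the padding starts.
targets = torch.randint(1, n_classes, (batch, max_target_len))
input_lengths = torch.full((batch,), T_frames, dtype=torch.long)
target_lengths = torch.randint(5, max_target_len + 1, (batch,), dtype=torch.long)

# CTC marginalizes over every valid alignment of the target labels
# to the input frames, so no frame-level annotation is needed.
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()
```

Note that each target sequence must be short enough to fit the available frames (here 50 frames comfortably cover at most 12 labels); violating this makes the loss infinite.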