Privileged Information Distillation for Language Models
Intermediate | Emiliano Penaloza, Dheeraj Vattikonda et al. | Feb 4 | arXiv
The paper shows how to train a language model with privileged information (extra hints available only during training) so that it still performs well at inference time, when those hints are unavailable.
#Privileged Information #Knowledge Distillation #π-Distill