Ministral 3
Beginner · Alexander H. Liu, Kartik Khandelwal et al. · Jan 13 · arXiv
Ministral 3 is a new family of small but capable AI language models (3B, 8B, and 14B parameters) that learn from a larger model using a step-by-step tutoring method called Cascade Distillation.
#Cascade Distillation #Model pruning #Logit distillation
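
To make the "learn from a larger model" idea concrete, here is a minimal sketch of a generic logit distillation loss for a single token position. It is an illustration only: the function names, the temperature parameter, and the choice of forward KL divergence (teacher to student) are assumptions, not details taken from the Ministral 3 paper.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of raw logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def logit_distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Hypothetical helper: KL(teacher || student) for one token position.

    Assumption: forward KL on temperature-scaled logits. The summary above
    does not state the exact objective used for Ministral 3.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher distribution
    log_q = log_softmax(student_logits, temperature)  # student distribution
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]
    student = [1.5, 1.2, 0.3]
    print(logit_distillation_loss(student, teacher))
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, which is what lets a smaller model be trained to imitate a larger one.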