Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning
Intermediate · Shaotian Yan, Kaiyuan Liu et al. · Jan 14 · arXiv
The paper introduces DASD-4B-Thinking, a small (4B-parameter) open-source reasoning model that scores comparably to much larger models on hard math, science, and coding benchmarks.
#sequence-level distillation #divergence-aware sampling #temperature-scheduled learning