Papers (2)

#data-efficient training

Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Shaotian Yan, Kaiyuan Liu et al. · Jan 14 · arXiv

The paper introduces DASD-4B-Thinking, a small (4B-parameter) open-source reasoning model that performs on par with much larger models on hard math, science, and coding benchmarks.

#sequence-level distillation, #divergence-aware sampling, #temperature-scheduled learning

Not triaged yet

Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes

Intermediate

Jing Tan, Zhaoyang Zhang et al. · Jan 5 · arXiv

Talk2Move is a training recipe that lets an image editor move, rotate, and resize the exact object you mention using plain text, while keeping the rest of the picture stable.

#text-guided image editing, #object-level transformation, #reinforcement learning

Not triaged yet