Papers (3)

#On-Policy Distillation

GLM-5: from Vibe Coding to Agentic Engineering

GLM-5 Team, Aohan Zeng et al. | Feb 17 | arXiv

GLM-5 is a new open-weight AI model that moves from 'vibe coding' (prompting the model to write code) to 'agentic engineering' (letting the model plan, build, test, and fix software on its own).

#GLM-5 #Agentic Engineering #DeepSeek Sparse Attention


Reinforcement Learning via Self-Distillation

Intermediate

Jonas Hübotter, Frederike Lübeck et al. | Jan 28 | arXiv

The paper trains large language models to learn from detailed feedback (such as error messages) rather than from a pass/fail score alone.

#Self-Distillation #Reinforcement Learning with Rich Feedback #SDPO


Self-Distillation Enables Continual Learning

Intermediate

Idan Shenfeld, Mehul Damani et al. | Jan 27 | arXiv

This paper shows a simple way for AI models to keep learning new things without forgetting what they already know.

#Self-Distillation Fine-Tuning #On-Policy Distillation #Continual Learning
