AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent
Intermediate · Yinyi Luo, Yiqiao Jin et al. · Feb 3 · arXiv
AgentArk teaches one language model to think like a whole team of models that debate, so it can solve tough problems quickly without running a long, expensive debate at answer time.
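As a rough illustration of the general idea (not AgentArk's actual pipeline), multi-agent distillation can be sketched as: run several agents on a question, score their reasoning traces (in practice with a process reward model), and turn the best trace into a supervised training example for a single student model. The agent names, canned outputs, and scores below are all hypothetical stand-ins.

```python
# Toy sketch of multi-agent distillation: stub agents "debate" a question,
# a scalar score stands in for a process reward model, and the winning
# trace becomes one (prompt -> reasoning + answer) training example.

def agent_answer(name: str, question: str) -> dict:
    """Stub agent: a real system would call an LLM and score steps
    with a process reward model; here everything is canned."""
    canned = {
        "solver":  ("2+2 means adding 2 and 2, giving 4.", "4", 0.9),
        "skeptic": ("Concatenation would give 22, but arithmetic gives 4.", "4", 0.7),
        "rusher":  ("Probably 5.", "5", 0.2),
    }
    reasoning, answer, score = canned[name]
    return {"agent": name, "reasoning": reasoning, "answer": answer, "score": score}

def distill_debate(question: str, agents: list[str]) -> dict:
    """Collect every agent's trace, keep the highest-scoring one,
    and emit a single-model fine-tuning example."""
    traces = [agent_answer(a, question) for a in agents]
    best = max(traces, key=lambda t: t["score"])
    return {
        "prompt": question,
        "target": f"{best['reasoning']}\nAnswer: {best['answer']}",
    }

example = distill_debate("What is 2+2?", ["solver", "skeptic", "rusher"])
print(example["target"])
```

After fine-tuning on many such examples, the student answers in one pass, which is what removes the expensive debate at inference time.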
#multi-agent distillation · #process reward model · #GRPO