Papers (5)

#model merging

Transformers converge to invariant algorithmic cores

Transformers trained separately can end up with very different weights, yet they often contain the same small "engine" that actually performs the task.

#algorithmic cores #mechanistic interpretability #transformers

Decouple Searching from Training: Scaling Data Mixing via Model Merging for Large Language Model Pre-training

Intermediate

Shengrui Li, Fei Zhao et al. · Jan 31 · arXiv

Training large language models works best with the right mix of data (general, math, code), but searching for the best mix has been slow and very expensive.

#data mixture optimization #model merging #weighted model merging

Behavior Knowledge Merge in Reinforced Agentic Models

Intermediate

Xiangchi Yuan, Dachuan Shi et al. · Jan 20 · arXiv

When several reinforcement-learned models are merged by simple averaging, their specialized skills get watered down; this paper addresses that problem.

#reinforcement learning #model merging #task vectors

EpiCaR: Knowing What You Don't Know Matters for Better Reasoning in LLMs

Intermediate

Jewon Yeom, Jaewon Sok et al. · Jan 11 · arXiv

This paper teaches AI models not just how to solve problems but also how to tell when their own answers might be wrong.

#EPICAR #calibration #epistemic uncertainty

QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management

Intermediate

Weizhou Shen, Ziyi Yang et al. · Dec 15 · arXiv

QwenLong-L1.5 is a post-training recipe that helps AI read and reason over very long documents by improving the data it learns from, the way it is trained, and how it manages memory.

#long-context reasoning #reinforcement learning #GRPO
