Papers (2)

#logit lens

Large Reasoning Models Are (Not Yet) Multilingual Latent Reasoners

Yihong Liu, Raoyuan Zhao et al. · Jan 6 · arXiv

Large reasoning models can often find the right math answer in their “head” before finishing their written steps, but this works best in languages with lots of training data, such as English and Chinese.

#latent reasoning #chain-of-thought #multilingual LLMs

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Beginner

Yuqiao Tan, Minzheng Wang et al. · Dec 22 · arXiv

Large language models (LLMs) don’t act as a single brain; inside, each layer and module quietly makes its own mini-decisions called internal policies.

#Bottom-up Policy Optimization #internal layer policy #internal modular policy
