Papers (2)

#activation patching

Lost in the Prompt Order: Revealing the Limitations of Causal Attention in Language Models

Placing the reading passage (context) before the question and answer options (CQO) makes language models substantially more accurate than placing it after them (QOC), by about 15 percentage points on average.

#causal attention #prompt order sensitivity #multiple-choice question answering

Not triaged yet
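
The CQO and QOC orderings in the entry above can be illustrated with a minimal sketch. The function name `build_prompt`, the block labels ("Passage:", "Question:", "Options:") and the trailing "Answer:" cue are assumptions made for illustration and are not taken from the paper's prompt template.

```python
def build_prompt(context: str, question: str, options: list[str], order: str = "CQO") -> str:
    """Assemble a multiple-choice prompt with its blocks in the requested order.

    `order` is a string over the letters C (context), Q (question), O (options),
    e.g. "CQO" or "QOC". Labels and layout are hypothetical, not the paper's.
    """
    letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    option_block = "\n".join(f"{letters[i]}. {opt}" for i, opt in enumerate(options))
    blocks = {
        "C": f"Passage: {context}",
        "Q": f"Question: {question}",
        "O": f"Options:\n{option_block}",
    }
    # Join the blocks in the order given, then append the answer cue.
    return "\n\n".join(blocks[key] for key in order) + "\n\nAnswer:"


passage = "The sky is blue."
cqo_prompt = build_prompt(passage, "What color is the sky?", ["Red", "Blue"], order="CQO")
qoc_prompt = build_prompt(passage, "What color is the sky?", ["Red", "Blue"], order="QOC")
```

Both prompts contain identical text; only the block order differs, which is the variable the entry's summary says accounts for the accuracy gap.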

Mechanistic Interpretability of Large-Scale Counting in LLMs through a System-2 Strategy

Intermediate

Hosein Hasani, Mohammadali Banayeeanzade et al. · Jan 6 · arXiv

Large language models (LLMs) handle many math problems well but often fail at simple counting once the list grows long.

#mechanistic interpretability #counting in LLMs #System-2 prompting

Not triaged yet