Papers2

#reading comprehension

Lost in the Prompt Order: Revealing the Limitations of Causal Attention in Language Models

Putting the reading passage before the question and answer choices (CQO) makes language models much more accurate than putting it after (QOC), by about 15 percentage points on average.

#causal attention#prompt order sensitivity#multiple-choice question answering

Not triaged yet

FIN-bench-v2: A Unified and Robust Benchmark Suite for Evaluating Finnish Large Language Models

Intermediate

Joona Kytöniemi, Jousia Piha et al.Dec 15arXiv

FIN-bench-v2 is a big, tidy set of Finnish tests that checks how good large language models are at many things like reading, logic, and world knowledge.

#Finnish language models#benchmark suite#HuggingFace Datasets

Not triaged yet