Papers (2)

#multilingual benchmark

Same Claim, Different Judgment: Benchmarking Scenario-Induced Bias in Multilingual Financial Misinformation Detection

This paper builds MFMD-Scen, a large benchmark that tests how AI models change their true/false judgments about the same financial claim when the scenario around it changes.

#financial misinformation detection #scenario-induced bias #multilingual benchmark

Not triaged yet

TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior

Beginner

Gül Sena Altıntaş, Malikeh Ehghaghi et al. · Dec 23 · arXiv

TokSuite is a science lab for tokenizers: it trains 14 language models that are identical in every way except for how they split text into tokens.

#tokenization #tokenizer robustness #Byte Pair Encoding (BPE)

Not triaged yet