πŸŽ“How I Study AIHISA
πŸ“–Read
πŸ“„PapersπŸ“°Blogs🎬Courses
πŸ’‘Learn
πŸ›€οΈPathsπŸ“šTopicsπŸ’‘Concepts🎴Shorts
🎯Practice
🧩Problems🎯Prompts🧠Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers3

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#benchmark dataset

PingPong: A Natural Benchmark for Multi-Turn Code-Switching Dialogues

Intermediate
Mohammad Rifqi Farhansyah, Hanif Muhammad Zhafran et al.Jan 24arXiv

Most people on Earth speak more than one language and often switch languages in the same chat, but AI tools aren’t tested well on this real behavior.

#code-switching#multilingual NLP#trilingual dialogue

EvasionBench: A Large-Scale Benchmark for Detecting Managerial Evasion in Earnings Call Q&A

Intermediate
Shijian Ma, Yan Lin et al.Jan 14arXiv

EvasionBench is a new, very large dataset that helps computers spot when company leaders dodge questions during earnings call Q&A.

#evasion detection#earnings call Q&A#financial NLP

Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models

Beginner
Li-Zhong Szu-Tu, Ting-Lin Wu et al.Dec 24arXiv

The paper builds YearGuessr, a giant, worldwide photo-and-text dataset of 55,546 buildings with their construction years (1001–2024), GPS, and popularity (page views).

#YearGuessr#building age estimation#ordinal regression