๐ŸŽ“How I Study AIHISA
๐Ÿ“–Read
๐Ÿ“„Papers๐Ÿ“ฐBlogs๐ŸŽฌCourses
๐Ÿ’กLearn
๐Ÿ›ค๏ธPaths๐Ÿ“šTopics๐Ÿ’กConcepts๐ŸŽดShorts
๐ŸŽฏPractice
๐ŸงฉProblems๐ŸŽฏPrompts๐Ÿง Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers4

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#benchmarking LLMs

No Shortcuts to Culture: Indonesian Multi-hop Question Answering for Complex Cultural Understanding

Intermediate
Vynska Amalia Permadi, Xingwei Tan et al.Feb 3arXiv

This paper builds ID-MoCQA, a new two-step (multi-hop) quiz set about Indonesian culture that makes AI connect clues before answering.

#multi-hop question answering#cultural reasoning#Indonesian culture

Decouple Searching from Training: Scaling Data Mixing via Model Merging for Large Language Model Pre-training

Intermediate
Shengrui Li, Fei Zhao et al.Jan 31arXiv

Training big language models works best when you mix the right kinds of data (general, math, code), but finding the best mix used to be slow and very expensive.

#data mixture optimization#model merging#weighted model merging

TAM-Eval: Evaluating LLMs for Automated Unit Test Maintenance

Intermediate
Elena Bruches, Vadim Alperovich et al.Jan 26arXiv

This paper introduces TAM-Eval, a new way to test how well AI models can create, fix, and update unit tests for real software projects.

#unit test maintenance#LLM for software engineering#reference-free evaluation

MentraSuite: Post-Training Large Language Models for Mental Health Reasoning and Assessment

Intermediate
Mengxi Xiao, Kailai Yang et al.Dec 10arXiv

MentraSuite is a complete toolkit that teaches large language models (LLMs) to reason about mental health step by step, not just sound caring.

#mental health reasoning#LLM post-training#supervised fine-tuning