How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (4)

#Benchmark

AdaptMMBench: Benchmarking Adaptive Multimodal Reasoning for Mode Selection and Reasoning Process

Intermediate
Xintong Zhang, Xiaowen Zhang et al. · Feb 2 · arXiv

AdaptMMBench is a new test that checks if AI models know when to just look and think, and when to use extra visual tools like zooming or brightening an image.

#Adaptive Multimodal Reasoning #Vision-Language Models #Tool Invocation

Wiki Live Challenge: Challenging Deep Research Agents with Expert-Level Wikipedia Articles

Beginner
Shaohan Wang, Benfeng Xu et al. · Feb 2 · arXiv

This paper builds a live challenge that tests how well Deep Research Agents (DRAs) can write expert-level Wikipedia-style articles.

#Deep Research Agents #Wikipedia Good Articles #Benchmark

AACR-Bench: Evaluating Automatic Code Review with Holistic Repository-Level Context

Intermediate
Lei Zhang, Yongda Yu et al. · Jan 27 · arXiv

AACR-Bench is a new test set that checks how well AI can do code reviews using the whole project, not just one file.

#Automated Code Review #Benchmark #Repository-level Context

MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents

Intermediate
Peizhou Huang, Zixuan Zhong et al. · Jan 18 · arXiv

This paper introduces MMDeepResearch-Bench (MMDR-Bench), a new test that checks how well AI “deep research agents” write long, citation-rich reports using both text and images.

#Multimodal Deep Research #Benchmark #Citation Grounding