🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
📝Daily Log🎯Prompts🧠Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers4

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#temporal localization

RIVER: A Real-Time Interaction Benchmark for Video LLMs

Intermediate
Yansong Shi, Qingsong Zhao et al.Mar 4arXiv

RIVER Bench is a new test that checks how well AI can watch a video stream and talk with you in real time.

#RIVER Bench#online video understanding#multimodal large language models

Agentic Very Long Video Understanding

Intermediate
Aniket Rege, Arka Sadhu et al.Jan 26arXiv

The paper tackles understanding super long, first‑person videos (days to a week) by giving an AI a smarter memory and better tools.

#entity scene graph#agentic planning#long-horizon video understanding

MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

Intermediate
Jingli Lin, Runsen Xu et al.Dec 11arXiv

This paper introduces MMSI-Video-Bench, a big, carefully hand-made test to check how well AI understands space and motion in videos.

#video-based spatial intelligence#multimodal large language models#spatial construction

See, Hear, and Understand: Benchmarking Audiovisual Human Speech Understanding in Multimodal Large Language Models

Intermediate
Le Thien Phuc Nguyen, Zhuoran Yu et al.Dec 1arXiv

This paper introduces AV-SpeakerBench, a new test that checks if AI can truly see, hear, and understand who is speaking, what they say, and when they say it in real videos.

#audiovisual reasoning#speaker attribution#temporal grounding