Multimodal Large Language Models (MLLMs) often hallucinate on video tasks, trusting linguistic priors and common sense over what the frames actually show.
TimeLens studies how to teach models not only what happens in a video, but exactly when it happens, a task known as video temporal grounding (VTG).
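To make the VTG task concrete: a model receives a video and a text query and must output a time span (start, end) in seconds; predictions are commonly scored against the ground-truth span with temporal Intersection over Union (tIoU). The source does not specify TimeLens's exact metric, so the helper below is an illustrative sketch of the standard tIoU computation, not the paper's implementation.

```python
def temporal_iou(pred: tuple[float, float], gt: tuple[float, float]) -> float:
    """Temporal IoU between two (start, end) spans in seconds.

    Illustrative metric for video temporal grounding; returns 0.0
    when the spans do not overlap.
    """
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    return inter / union if union > 0 else 0.0


# Example: predicted span [2s, 5s] vs. ground truth [3s, 6s]
print(temporal_iou((2.0, 5.0), (3.0, 6.0)))  # → 0.5
```

A benchmark would typically report the fraction of queries whose tIoU exceeds a threshold such as 0.5 or 0.7 (often written R@1, IoU=0.5).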