How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (2)

#cross-attention fusion

Uncertainty-Aware Vision-Language Segmentation for Medical Imaging

Aryan Das, Tanishq Rachamalla et al., Feb 16, arXiv

This paper builds a medical image segmentation system that uses both pictures (like X-rays) and words (short clinical text) at the same time.

#medical image segmentation #vision-language segmentation #uncertainty estimation

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

Kai Liu, Jungang Li et al., Dec 28, arXiv

JavisGPT is a single AI that can both understand sounding videos (audio and video together) and create new ones in which the audio and video stay in sync.

#multimodal large language model #audio-video synchronization #SyncFusion