🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
📝Daily Log🎯Prompts🧠Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers7

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#pseudo-labeling

Tool Verification for Test-Time Reinforcement Learning

Intermediate
Ruotong Liao, Nikolai Röhrich et al.Mar 2arXiv

The paper fixes a big flaw in test-time reinforcement learning (TTRL): when many wrong answers agree, the model rewards the mistake and gets stuck.

#test-time reinforcement learning#verification-weighted voting#tool verification

Retrieve and Segment: Are a Few Examples Enough to Bridge the Supervision Gap in Open-Vocabulary Segmentation?

Intermediate
Tilemachos Aravanis, Vladan Stojnić et al.Feb 26arXiv

This paper teaches an AI to segment any object you name (open-vocabulary) much better by adding a few example pictures with pixel labels and smart retrieval.

#open-vocabulary segmentation#vision-language models#retrieval-augmented

Large Multimodal Models as General In-Context Classifiers

Intermediate
Marco Garosi, Matteo Farina et al.Feb 26arXiv

People often pick CLIP-like models for image labeling, but this paper shows that large multimodal models (LMMs) can be just as good—or even better—when you give them a few examples in the prompt (in-context learning).

#in-context learning#multimodal models#open-world classification

MetricAnything: Scaling Metric Depth Pretraining with Noisy Heterogeneous Sources

Intermediate
Baorui Ma, Jiahui Yang et al.Jan 29arXiv

Metric Anything is a new way to teach AI real, ruler-like distances (metric depth) from very mixed and noisy 3D data.

#metric depth estimation#sparse metric prompt#monocular depth

VideoMaMa: Mask-Guided Video Matting via Generative Prior

Intermediate
Sangbeom Lim, Seoung Wug Oh et al.Jan 20arXiv

VideoMaMa is a model that turns simple black-and-white object masks into soft, precise cutouts (alpha mattes) for every frame of a video.

#video matting#alpha matte#binary segmentation mask

An Empirical Study on Preference Tuning Generalization and Diversity Under Domain Shift

Intermediate
Constantinos Karouzos, Xingwei Tan et al.Jan 9arXiv

Preference tuning teaches language models to act the way people like, but those habits can fall apart when the topic or style changes (domain shift).

#preference tuning#domain shift#supervised fine-tuning

Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation

Intermediate
Xin Lin, Meixi Song et al.Dec 18arXiv

This paper builds a foundation model called DAP that estimates real-world (metric) depth from any 360° panorama, indoors or outdoors.

#panoramic depth estimation#metric depth#360-degree vision