Papers (2)

#multimodal large language models

Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition

Beginner

Yuhao Dong, Shulin Tian et al. · Feb 9 · arXiv

This paper teaches AI models to pick up how-to steps from demonstrations given in context, the way people learn by watching.

#video in-context learning · #procedural video understanding · #multimodal large language models

Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

Beginner

Jiaqi Tang, Jianmin Chen et al. · Dec 19 · arXiv

Robust-R1 teaches vision-language models to notice how a picture is damaged, think through what that damage hides, and then answer as if the picture were clear.

#Robust-R1 · #degradation-aware reasoning · #multimodal large language models
