PROGRESSLM: Towards Progress Reasoning in Vision-Language Models
Intermediate · Jianshu Zhang, Chengxuan Qian et al. · Jan 21 · arXiv
This paper asks a new question for vision-language models: not just 'What do you see?' but 'How far along is the task right now?'
#progress reasoning #vision-language models #episodic retrieval