RANKVIDEO is a video-native reasoning reranker that helps search engines find the right videos for a text query by directly looking at the videoβs visuals and audio, not just text captions.
This paper asks a new question for vision-language models: not just 'What do you see?' but 'How far along is the task right now?'