From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model
Intermediate | Kevin Cannons, Saeed Ranjbar Alvar et al. | Dec 4 | arXiv
This paper introduces TAD, a new benchmark that tests whether vision-language models can understand what happens over time in real driving videos.
#Temporal understanding #Autonomous driving #Vision-language models