TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions
Intermediate · Linli Yao, Yuancheng Wei et al. · Feb 9 · arXiv
This paper teaches AI to write movie-like scripts for multi-scene videos, adding exact timestamps and rich details about what is seen and heard.
#Omni Dense Captioning #time-aware video captioning #audio-visual understanding