4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation
Intermediate · Chiao-An Yang, Ryo Hachiuma et al. · Dec 18 · arXiv
This paper teaches a video-understanding AI to think in 3D plus time (4D) so it can answer questions about specific objects moving in videos.
#4D perception #multimodal large language models #perceptual distillation