DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving Understanding, Perception, Prediction and Planning
Intermediate | Zhe Liu, Runhui Huang et al. | Dec 14 | arXiv
DrivePI is a single, small (0.5B-parameter) multimodal language model that takes camera and LiDAR input, answers in natural language, and plans driving actions, all within one model.
#DrivePI #Vision-Language-Action #3D occupancy