MASQuant: Modality-Aware Smoothing Quantization for Multimodal Large Language Models
Intermediate · Lulu Hu, Wenhu Xiao et al. · Mar 5 · arXiv
Multimodal AI models handle text, images, and audio, but the signals from each modality differ widely in magnitude, which breaks standard low‑bit quantization methods.
#post‑training quantization #multimodal LLM #channel‑wise smoothing