NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models
Intermediate · Hyochan Chong, Dongkyu Kim et al. · Feb 6 · arXiv
NanoQuant is a post-training quantization method that compresses large language models to 1 bit per weight, and even below 1 bit per weight, without retraining on huge datasets.
#post-training quantization #sub-1-bit quantization #binary LLMs