COMPOT is a training-free way to shrink Transformer models while preserving their capabilities.
NanoQuant is a new way to shrink large language models down to 1 bit, or even less than 1 bit, per weight without retraining on huge datasets.
World models are AI tools that imagine the future so a robot can plan what to do next, but they are expensive to run many times in a row.