Note: This paper was written with AI, and the exploration it describes was done collaboratively with AI. The "we" described here is us: me and a few models.
With llama.cpp model quantization, properly adjusting models to keep their performance after ...