Out of curiosity: have you heard of vector symbolic architectures (also known as hyperdimensional computing)? They are not LLMs, but in some ways they seem to share very similar underlying dynamics, and they tend to use 1-bit or ternary representations. That said, I think their biggest draw is an easy-to-understand mathematical framework for how arbitrarily complex knowledge structures can be meaningfully built and manipulated in a high-dimensional vector space. If you decide to give them a closer look, it would be interesting to hear your thoughts on whether VSAs and LLMs might be doing something fundamentally similar. :-)
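For anyone curious, the two core VSA operations (binding and bundling) fit in a few lines of NumPy. This is only a toy sketch of the common bipolar variant; the record being encoded and all names are illustrative, not from any particular VSA library:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # high dimensionality is what makes retrieval reliable

def random_hv():
    # Random bipolar (+1/-1) hypervector: the "1-bit" style representation
    return rng.choice([-1, 1], size=D)

def bind(a, b):
    # Binding (elementwise multiply) associates two vectors;
    # for bipolar vectors it is its own inverse: bind(bind(a, b), b) == a
    return a * b

def bundle(*vs):
    # Bundling (sign of the elementwise sum) superposes vectors into one
    # vector that stays similar to each of its inputs
    return np.sign(np.sum(vs, axis=0))

def sim(a, b):
    # Normalized dot product, i.e. cosine similarity for bipolar vectors
    return float(a @ b) / D

# Encode a tiny record {color: red, shape: round} as a single hypervector
color, shape = random_hv(), random_hv()
red, round_ = random_hv(), random_hv()
record = bundle(bind(color, red), bind(shape, round_))

# Unbinding with the "color" role vector queries the record: the noisy
# result is clearly similar to "red" and nearly orthogonal to everything else
guess = bind(record, color)
print(sim(guess, red))     # high similarity
print(sim(guess, round_))  # near zero
```

The point of the exercise: structured, queryable records live in a single flat vector, with similarity doing the retrieval — which is what invites the comparison to how LLM residual streams seem to superpose features.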
Satoshi Takahashi
Medical AI researcher, neurosurgeon
Great article. The comparison to the post-quantization model certainly deserves a more thorough treatment.