Google Dropped TurboQuant Two Weeks Ago. The Community Already Made It Usable.
Google published the TurboQuant paper on March 25. It's April 7. There are already five independent implementations, a llama.cpp fork running 104B-parameter models on a MacBook, and an active vLLM integration effort. Google hasn't released a single l...
alan-west.hashnode.dev · 8 min read