How quantization keeps vector search in RAM
I've been building a RAG system over a terabyte of scanned engineering manuals: tens of millions of pages, chunked into roughly six million searchable pieces. At that scale the interesting problems sto
manavgupta.hashnode.dev · 8 min read