Been rotating through vector databases for RAG features and keep hitting the same wall. Started with Pinecone (easy, but the pricing makes me sad when I'm just prototyping). Moved to Weaviate self-hosted and it worked until I needed to reindex 50M vectors. That took 16 hours and brought the entire service down.
Now I'm testing Milvus and it's... fine? But the Kubernetes deployment is a pain and I don't fully trust the consistency guarantees yet. Meanwhile everyone on Twitter says to just use pgvector in Postgres and call it a day.
The thing is, I don't think any of these actually solves the problem well. They're all either managed but expensive, or self-hosted but operationally heavy. I'm not doing anything exotic, just nearest-neighbor search with filtering. Why does this still feel like duct tape?
What are you actually using in production that doesn't make you want to quit? And be honest about the tradeoffs.
Nina Okafor
ML engineer working on LLMs and RAG pipelines
Postgres with pgvector honestly works better than people expect if you're not doing millions of QPS. We run 40M vectors in prod, and reindexing happens in the background without downtime using an IVFFlat + HNSW hybrid approach. Pinecone pricing is brutal once you start scaling.
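For anyone wondering what the no-downtime part looks like: Postgres lets you build indexes with `CREATE INDEX CONCURRENTLY`, which avoids the exclusive lock that blocks writes, and pgvector's HNSW indexes support it. A rough sketch (table, column names, and 1536 dims are all made up for illustration; pick dimensions matching your embedding model and tune `m`/`ef_construction` for your data):

```sql
-- Enable pgvector and define a table with an embedding column.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id        bigserial PRIMARY KEY,
    tenant_id int       NOT NULL,   -- example filter column
    body      text,
    embedding vector(1536)          -- dims must match your embedding model
);

-- Build the ANN index without taking an exclusive lock, so writes keep flowing.
-- m and ef_construction trade build time and memory for recall
-- (pgvector's defaults are 16 and 64).
CREATE INDEX CONCURRENTLY documents_embedding_hnsw
    ON documents USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);
```

When you need to rebuild, `REINDEX INDEX CONCURRENTLY` (Postgres 12+) gives the same non-blocking behavior, which is how the "reindex without taking the service down" story works.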
Milvus consistency is solid if you actually read the docs on consistency levels, but the K8s overhead isn't worth it unless you're already running that stack. For 50M vectors at typical RAG scale, Postgres + pgvector saves you from operational complexity that usually isn't necessary.
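And the "nearest-neighbor search with filtering" case from the original post is just a plain query in this setup. A sketch, assuming a `documents` table with a `vector`-typed `embedding` column and a `tenant_id` filter column (all names hypothetical):

```sql
-- Widen the HNSW search for better recall (pgvector's default ef_search is 40).
SET hnsw.ef_search = 100;

-- Top-10 nearest neighbors by cosine distance, restricted to one tenant.
-- <=> is pgvector's cosine distance operator; $1 is the query embedding.
SELECT id, body, embedding <=> $1 AS distance
FROM documents
WHERE tenant_id = 42
ORDER BY embedding <=> $1
LIMIT 10;
```

One caveat: with a very selective filter, HNSW can return fewer than `LIMIT` rows because filtering happens after the index scan; recent pgvector versions add iterative scan settings to work around that, so check which version you're on.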