CG: Sorry for the late reply. We used https://myscale.github.io/benchmark/#/
Article · Dec 5, 2024 · pgvector vs. pgvecto.rs in 2024: A Comprehensive Comparison for Vector Search in PostgreSQL
CG: Thanks for the question. You need to store the full-precision vector data, but you can build the index with binary vectors, which reduces the memory usage of the index. The data and the index are two separate things, and each needs storage.
Article · Apr 23, 2024 · My binary vector search is better than your FP32 vectors
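To illustrate the data/index split, here is a minimal sketch in plain Python. It assumes the full-precision vectors are kept so that candidates found through the binary codes can be reranked exactly; that rerank step, the brute-force Hamming scan (a stand-in for a real index such as HNSW), and every function name and the `oversample` parameter are assumptions made for illustration, not the pgvecto.rs implementation.

```python
def to_bits(vector):
    """Pack a vector into an int: bit is 1 for positive components, else 0."""
    bits = 0
    for x in vector:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def dot(a, b):
    """Full-precision inner product."""
    return sum(x * y for x, y in zip(a, b))


def build_index(data):
    """The 'index': one small binary code per stored vector."""
    return [to_bits(v) for v in data]


def search(query, data, codes, top_k=1, oversample=2):
    """Pick candidates by Hamming distance, then rerank with the stored data."""
    q = to_bits(query)
    by_hamming = sorted(range(len(data)), key=lambda i: hamming(q, codes[i]))
    candidates = by_hamming[: top_k * oversample]
    reranked = sorted(candidates, key=lambda i: dot(query, data[i]), reverse=True)
    return reranked[:top_k]


data = [[0.9, 0.1, -0.2], [0.1, 0.9, -0.3], [-0.5, -0.4, 0.8]]
codes = build_index(data)
result = search([0.2, 1.0, -0.1], data, codes, top_k=1, oversample=2)
```

In this toy data the first two vectors share the same binary code, so only the full-precision values can tell them apart. That is one reason the original data still has to be stored next to the smaller index.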
CG: We still use HNSW to build the index over the bit vectors. The index takes 0.6 GB of storage. In the experiment we used our own pgvecto.rs, a vector search extension for Postgres.
Article · Apr 3, 2024 · My binary vector search is better than your FP32 vectors
CG: No, that is actually a new feature of the OpenAI embedding models. You can drop some of the dimensions and the embedding will still work well. Under the hood this is Matryoshka Representation Learning: https://aniketrege.github.io/blog/2024/mrl/
Article · Apr 2, 2024 · My binary vector search is better than your FP32 vectors
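As a rough sketch of what dropping dimensions looks like on the client side: keep the first `k` components and re-normalize to unit length. The function name `shorten` and the toy vector are hypothetical, and the sketch assumes the dimensions are dropped from the end, which is how Matryoshka-style embeddings are usually shortened.

```python
import math


def shorten(embedding, k):
    """Keep the first k dimensions and rescale the result to unit length."""
    head = list(embedding[:k])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # Nothing to rescale; return the truncated vector unchanged.
        return head
    return [x / norm for x in head]


short = shorten([3.0, 4.0, 12.0], 2)
```

Re-normalizing matters when the shortened vectors are compared with cosine or inner-product distance, since truncation alone leaves them shorter than unit length.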
CG: Yuxin Xu, hello. That statement is a theoretical result. FP32 (32-bit floating point) needs 32 bits to store each element, whereas a binary vector needs only 1 bit per element.
Article · Apr 2, 2024 · My binary vector search is better than your FP32 vectors
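The 32x figure follows directly from the bit widths. A quick back-of-the-envelope check in Python, where the vector count and dimension are illustrative assumptions, not numbers from the experiment:

```python
def fp32_bytes(num_vectors, dim):
    """Raw storage for FP32 vectors: 4 bytes (32 bits) per element."""
    return num_vectors * dim * 4


def binary_bytes(num_vectors, dim):
    """Raw storage for binary vectors: 1 bit per element, rounded up to bytes."""
    return num_vectors * ((dim + 7) // 8)


# Illustrative sizes only (assumed).
n, d = 1_000_000, 3072
full = fp32_bytes(n, d)
packed = binary_bytes(n, d)
ratio = full / packed
```

This counts only the raw vector payload. A real index (for example an HNSW graph) adds its own overhead on top, so the saving measured in practice will be smaller than the theoretical ratio.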
CG: Thanks! Adjusting the cutoff point that decides whether a number becomes 0 or 1 would be an improvement. Currently, we just convert positive numbers to 1 and negative numbers to 0.
Article · Apr 1, 2024 · My binary vector search is better than your FP32 vectors
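A minimal sketch of both variants, in plain Python. The sign-based rule matches the description above; the per-dimension median is just one possible way to pick a cutoff and is an assumption here, as are all the function names. The sketch also maps an exact zero to 0, which the description above does not specify.

```python
from statistics import median


def binarize(vector, cutoff=0.0):
    """Map each component to 1 if it is above the cutoff, otherwise 0."""
    return [1 if x > cutoff else 0 for x in vector]


def per_dimension_medians(vectors):
    """Hypothetical cutoff choice: the median of each dimension over a sample."""
    return [median(column) for column in zip(*vectors)]


def binarize_with_cutoffs(vector, cutoffs):
    """Binarize using a separate cutoff for every dimension."""
    return [1 if x > c else 0 for x, c in zip(vector, cutoffs)]


sample = [[0.1, 5.0], [0.3, 7.0], [0.2, 6.0]]
cutoffs = per_dimension_medians(sample)
sign_bits = binarize([0.3, -0.1, 0.0, 2.0])
tuned_bits = binarize_with_cutoffs([0.25, 5.5], cutoffs)
```

With a cutoff of 0, a dimension whose values are almost always positive carries almost no information after binarization. A data-driven cutoff splits such a dimension more evenly, which is the kind of improvement suggested above.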