Well-documented article. Insightful, I must say.
Thanks!
Great documentation! I wonder if there's room to tune the threshold by which we decide whether a float is 0 or 1. Obviously, the threshold tuning should be done on a sample of the data. What do you think?
Thanks! It would be an improvement to tune the cutoff point for deciding whether a number becomes 0 or 1. Currently, we simply convert positive numbers to 1 and negative numbers to 0.
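For illustration, a minimal sketch of this sign-based binarization with an optional cutoff (the function name and sample values are mine, not from the article):

```python
import numpy as np

def binarize(vec: np.ndarray, threshold: float = 0.0) -> np.ndarray:
    # Map each float to 1 if it exceeds the threshold, else 0.
    # The article's scheme uses threshold = 0.0 (positive -> 1, negative -> 0);
    # the suggestion above is to tune this cutoff on a sample of the data.
    return (vec > threshold).astype(np.uint8)

emb = np.array([0.12, -0.53, 0.07, -0.01])
print(binarize(emb))       # sign-based: [1 0 1 0]
print(binarize(emb, 0.1))  # tuned cutoff: [1 0 0 0]
```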
Can I ask how you arrived at the claim that this "can reduce memory usage by up to 32 times"? Thank you! I've been stuck at that step for a while, not knowing why.
Is it because of the example below (20 / 0.6 = 33.33...)?
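A plausible reading, rather than the 20/0.6 figures: each fp32 value occupies 32 bits while a binarized value occupies 1 bit, which gives the 32x bound directly:

```python
# One fp32 dimension costs 32 bits; one binarized dimension costs 1 bit.
fp32_bits_per_dim = 32
binary_bits_per_dim = 1
print(fp32_bits_per_dim / binary_bits_per_dim)  # 32.0

# Concretely, for a 1536-dim embedding (an illustrative size):
dim = 1536
fp32_bytes = dim * 4     # 6144 bytes per vector
binary_bytes = dim // 8  # 192 bytes per vector
```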
May I ask how the index shortening is performed? Did you use PCA or some other dimensionality-reduction method?
No, that is actually a new feature of the OpenAI embedding models: you can selectively drop certain dimensions, and the embeddings will still work well.
Under the hood is Matryoshka Representation Learning aniketrege.github.io/blog/2024/mrl
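A sketch of what that dimension dropping looks like in practice: keep a prefix of the vector and re-normalize. The function name and the 1536/256 sizes are illustrative, not from the article:

```python
import numpy as np

def shorten(vec: np.ndarray, dim: int) -> np.ndarray:
    # Matryoshka-trained embeddings front-load information, so a prefix
    # of the vector is still a usable embedding after re-normalization.
    prefix = vec[:dim]
    return prefix / np.linalg.norm(prefix)

# Illustrative sizes: a 1536-dim embedding shortened to 256 dims.
full = np.random.default_rng(0).standard_normal(1536)
short = shorten(full, 256)
print(short.shape)  # (256,)
```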
Cool article. I understand how you transform the fp32 vector into a bit vector, but how do you do the nearest-neighbor search on the set of bit vectors to get the initial 200 you describe in your article? Do you use brute force, Annoy, or something else?
We still use HNSW to build an index for the bit vectors; it takes 0.6 GB to store the indexes. In the experiment we use our own pgvecto.rs, a vector search extension for Postgres.
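For context, the distance between bit vectors is typically Hamming distance (XOR, then a popcount), which is what makes the binary index cheap to traverse. A minimal sketch; the HNSW graph itself is omitted and `hamming` is an illustrative helper:

```python
import numpy as np

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    # a and b are packed bit vectors (output of np.packbits).
    # XOR flags the differing bits; summing the unpacked bits counts them.
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

a = np.packbits([1, 0, 1, 0])
b = np.packbits([1, 1, 1, 1])
print(hamming(a, b))  # 2
```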
I understand the memory reduction from binary vectors. However, if you use the normal vectors for kNN re-ranking, you still need the complete vectors for all items, right? That sounds like you need even more memory. Can you elaborate on that?
Thanks for the question. You still need to store the full-precision vector data, but you can build the index with binary vectors, so the memory usage of the index is reduced. The data and the index are two different things that need storage.
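A sketch of that two-stage setup, assuming a coarse binary candidate search followed by fp32 re-ranking; the function name, brute-force stage (the article uses HNSW), and toy sizes are mine:

```python
import numpy as np

def search(query, bits, full, top_k=10, candidates=200):
    # Stage 1: coarse candidate search over the binary index.
    # (Brute-force Hamming distance here to keep the sketch short.)
    q_bits = np.packbits(query > 0)
    dists = np.unpackbits(np.bitwise_xor(bits, q_bits), axis=1).sum(axis=1)
    cand = np.argsort(dists)[:candidates]
    # Stage 2: re-rank the candidates with the stored full-precision data.
    scores = full[cand] @ query
    return cand[np.argsort(-scores)[:top_k]]

# Toy data: 1000 vectors of 64 dims, plus their packed binary index.
rng = np.random.default_rng(1)
full = rng.standard_normal((1000, 64))
bits = np.packbits(full > 0, axis=1)

query = full[42]
result = search(query, bits, full)  # row 42 itself should rank first
```

The point is that `bits` (the index) is what HNSW traverses, while `full` (the data) is only consulted for the small candidate set.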
Thank you, I get it now. Ce Gao