Ce Gao · 39 likes · 19.8K reads · 14 comments

Aditya k · Mar 25, 2024

Well-documented article. Insightful, I must say.

Ce Gao (Author) · Mar 26, 2024

Thanks!

Zakaria El hjouji · Mar 29, 2024

Great documentation! I wonder if there's room to tune the threshold by which we decide whether a float is 0 or 1. Obviously, the threshold tuning should be done on a sample of the data. What do you think?

Ce Gao (Author) · Apr 1, 2024

Thanks! It would be an improvement to adjust the cutoff point that decides whether a number is treated as 0 or 1. Currently, we just convert positive numbers to 1 and negative numbers to 0.
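
A minimal sketch of that cutoff in NumPy, with a hypothetical `threshold` argument standing in for the tunable version suggested above (the article's rule is the fixed `0.0` default):

```python
import numpy as np

def binarize(embedding: np.ndarray, threshold: float = 0.0) -> np.ndarray:
    """Map a float vector to bits: values above the threshold become 1, the rest 0.

    threshold=0.0 reproduces the sign-based rule from the article; a data-driven
    cutoff (e.g. the per-dimension median of a sample) is the tuning idea raised
    in this thread.
    """
    return (embedding > threshold).astype(np.uint8)

# Hypothetical data-driven variant: per-dimension medians estimated from a sample.
sample = np.random.randn(1000, 1536).astype(np.float32)
per_dim_threshold = np.median(sample, axis=0)
bits = (sample > per_dim_threshold).astype(np.uint8)
```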

Yuxin Xu · Apr 1, 2024

Can I ask how the "can reduce memory usage by up to 32 times" figure was calculated? Thank you! I've been stuck on that step for a while, not knowing why.

Yuxin Xu · Apr 1, 2024

Is it because of the example below (20/0.6 = 33.3333...)?

Ce Gao (Author) · Apr 2, 2024

Yuxin Xu Hello, that statement is a theoretical result. The FP32 (32-bit floating-point) format requires 32 bits per value, whereas a binary vector needs only 1 bit per element.
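
The arithmetic behind that figure, with an illustrative 1536-dimensional vector:

```python
dims = 1536                     # e.g. an OpenAI text-embedding-3-small vector
fp32_bits = dims * 32           # 32 bits per FP32 element
binary_bits = dims * 1          # 1 bit per element after binarization
print(fp32_bits / binary_bits)  # 32.0 -> the theoretical "up to 32 times"
```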

Trung Dinh · Apr 2, 2024

May I ask how the shortening is performed? Did you use PCA or some other dimensionality-reduction method?

Ce Gao (Author) · Apr 2, 2024

No, that is actually a new feature of the OpenAI embedding models. You can selectively discard certain dimensions, and the embeddings will still function appropriately.

Under the hood it is Matryoshka Representation Learning: aniketrege.github.io/blog/2024/mrl
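
A minimal sketch of that shortening step, assuming the usual MRL convention of keeping a prefix of the dimensions and re-normalizing (the OpenAI API exposes this as a `dimensions` parameter on the text-embedding-3 models; the numbers here are illustrative):

```python
import numpy as np

def shorten(embedding: np.ndarray, keep: int) -> np.ndarray:
    """Keep the first `keep` dimensions of an MRL-trained embedding and re-normalize."""
    truncated = embedding[:keep]
    return truncated / np.linalg.norm(truncated)

full = np.random.randn(3072).astype(np.float32)  # e.g. a text-embedding-3-large vector
short = shorten(full, 256)                       # much smaller, still usable for search
```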

Jans Aasman · Apr 2, 2024

Cool article. I understand how you transform the FP32 vector into a bit vector, but how do you do the nearest-neighbor search on the set of bit vectors to get the initial 200 you describe in your article? Do you use brute force, Annoy, or something else?

Ce Gao (Author) · Apr 3, 2024

We still use HNSW to build an index for the bit vectors; it takes 0.6 GB to store the indexes. In the experiment we use our own pgvecto.rs, the vector search extension for Postgres.
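
A rough sketch of the two-stage flow described here, with brute-force Hamming distance standing in for the HNSW index in pgvecto.rs (purely illustrative, not the extension's actual code):

```python
import numpy as np

def search(query_fp32, data_fp32, data_bits, coarse_k=200, final_k=10):
    """Stage 1: coarse search over bit vectors; stage 2: re-rank with FP32 vectors."""
    query_bits = (query_fp32 > 0).astype(np.uint8)
    # Hamming distance to every bit vector (an index replaces this brute-force scan).
    hamming = np.count_nonzero(data_bits != query_bits, axis=1)
    candidates = np.argsort(hamming)[:coarse_k]

    # Re-rank the candidates by cosine similarity on the original vectors.
    cand = data_fp32[candidates]
    sims = cand @ query_fp32 / (np.linalg.norm(cand, axis=1) * np.linalg.norm(query_fp32))
    return candidates[np.argsort(-sims)[:final_k]]
```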

Jettro Coenradie · Apr 22, 2024

I understand the memory reduction from using binary vectors. However, if you use the normal vectors for kNN re-ranking, you still need the complete vectors for all items, right? That sounds like you need even more memory. Can you elaborate on that?

Ce Gao (Author) · Apr 23, 2024

Thanks for the question. You still need to store the full-precision vector data, but you can build the index with binary vectors, so the memory usage of the index is reduced. The data and the index are two different things that each need storage.
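
A back-of-the-envelope illustration of that split, with hypothetical numbers (1M vectors of 1536 dimensions; HNSW graph overhead ignored):

```python
n, dims = 1_000_000, 1536
data_gb = n * dims * 4 / 1e9   # full-precision FP32 rows stored in the table
index_gb = n * dims / 8 / 1e9  # bit vectors inside the index (1 bit per dimension)
print(f"data ≈ {data_gb:.2f} GB, index ≈ {index_gb:.2f} GB")  # ≈ 6.14 GB vs ≈ 0.19 GB
```

Only the index, which typically has to sit in memory for fast search, shrinks; the full vectors are only touched for the final re-ranking step.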

Jettro Coenradie · Apr 23, 2024

Thank you, I get it now, Ce Gao.
