Unlocking 70% Faster Response Times Through Token Pooling
TLDRThis post examines improvements made to ColiVara, our ColPali-based retrieval API. We focus on hybrid search and hierarchical clustering token pooling. By benchmarking these two approaches, we aim to evaluate their impact on latency and performan...
blog.colivara.com8 min read