Unlocking 70% Faster Response Times Through Token Pooling
Dec 2, 2024 · 8 min read

TL;DR: This post examines improvements made to ColiVara, our ColPali-based retrieval API. We focus on hybrid search and hierarchical clustering token pooling. By benchmarking these two approaches, we aim to evaluate their impact on latency and performan...


