Unlocking 70% Faster Response Times Through Token Pooling
TLDR: This post examines improvements made to ColiVara, our ColPali-based retrieval API. We focus on hybrid search and hierarchical clustering token pooling. By benchmarking these two approaches, we aim to evaluate their impact on latency and performan...



