[Feature Request]: Reducing tensor storage overhead through token pooling for any ColBERT-like late interaction models

Open · yingfeng opened this issue 1 year ago · 1 comment

Is there an existing issue for the same feature request?

  • [X] I have checked the existing issues.

Is your feature request related to a problem?

No response

Describe the feature you'd like

https://arxiv.org/abs/2409.14683

Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling

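Token pooling and binary quantization are orthogonal: pooling reduces the number of vectors stored per document, while binary quantization reduces the size of each vector, so the two can be applied together.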
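To make the request concrete, here is a minimal sketch of token pooling, assuming the simplest possible variant: consecutive token embeddings are averaged in fixed-size chunks. The linked paper centers on clustering-based pooling, which this sketch does not reproduce. The names `mean_pool`, `pool_tokens`, `maxsim`, and the `pool_factor` parameter are hypothetical and used only for illustration.

```python
import math


def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors, then L2-normalize."""
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    # Leave an all-zero mean untouched to avoid dividing by zero.
    return [x / norm for x in mean] if norm > 0 else mean


def pool_tokens(token_vectors, pool_factor=2):
    """Mean-pool consecutive token vectors in chunks of `pool_factor`.

    Assumption: chunking by position stands in for the clustering step
    used by more elaborate pooling strategies.
    """
    if pool_factor < 1:
        raise ValueError("pool_factor must be >= 1")
    return [
        mean_pool(token_vectors[i:i + pool_factor])
        for i in range(0, len(token_vectors), pool_factor)
    ]


def maxsim(query_vectors, doc_vectors):
    """Late-interaction score: for each query token, take the maximum
    dot product over document tokens, then sum over query tokens."""
    return sum(
        max(sum(q * d for q, d in zip(qv, dv)) for dv in doc_vectors)
        for qv in query_vectors
    )


# Toy document with five 2-dimensional token embeddings.
doc = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
pooled = pool_tokens(doc, pool_factor=2)
query = [[1.0, 0.0]]
score = maxsim(query, pooled)
```

With `pool_factor=2`, the five token vectors shrink to three stored vectors, and the pooled document can still be scored with the same MaxSim function. Binary quantization could then be applied to each pooled vector independently.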

Describe implementation you've considered

No response

Documentation, adoption, use case

No response

Additional information

No response

yingfeng · Sep 24 '24 08:09

Another token pooling strategy: https://www.answer.ai/posts/colbert-pooling.html

yingfeng · Sep 25 '24 07:09