
FastEmbed ColBERT Reranker


Related Issues

  • fixes https://github.com/deepset-ai/haystack/issues/8245

Proposed Changes:

Feature: Add FastembedColbertReranker — a ColBERT late-interaction (MaxSim) reranker inside the FastEmbed integration.

  • New component: haystack_integrations.components.rankers.fastembed.FastembedColbertReranker
  • Uses FastEmbed’s LateInteractionTextEmbedding to encode query and documents (token-level).
  • Implements ColBERT MaxSim scoring: score(q, d) = Σ_i max_j sim(q_i, d_j), with configurable similarity={"cosine", "dot"} and optional L2 normalization; see the sketch after this list.
  • CPU-friendly ONNX backend via FastEmbed; no new core Haystack deps.
  • Batching support (batch_size) and optional token limits (max_query_tokens, max_doc_tokens).
  • Stable sorting with a deterministic tie-break, parameter validation, and a warm-up that tolerates minor drift in FastEmbed keyword arguments.
  • Small README note + CHANGELOG entry for the FastEmbed integration.
  • Example script: examples/fastembed_colbert_reranker.py.
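
For reference, a minimal NumPy sketch of the MaxSim scoring described above (illustrative only; the component itself operates on the token-level embeddings returned by FastEmbed):

  import numpy as np

  def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray, similarity: str = "cosine") -> float:
      """ColBERT MaxSim: for each query token, take its best-matching document token and sum.

      query_emb has shape (num_query_tokens, dim); doc_emb has shape (num_doc_tokens, dim).
      """
      if similarity == "cosine":
          # L2-normalize token embeddings so the dot product equals cosine similarity
          query_emb = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
          doc_emb = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
      sim_matrix = query_emb @ doc_emb.T          # (num_query_tokens, num_doc_tokens)
      return float(sim_matrix.max(axis=1).sum())  # Σ_i max_j sim(q_i, d_j)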

Why: Cross-encoder rankers don’t cover late-interaction (bi-encoder) models like ColBERT v2. This adds a first-class ColBERT reranker for reranking ~100–500 retrieved candidates.

How did you test it?

  • Unit tests

    • Pure math checks for MaxSim and L2 normalization (see the test sketch after this list).
    • Parameter validation (invalid similarity, batch_size, top_k).
    • Deterministic ranking & top_k via monkeypatched encoders (no model download required).
  • Manual verification

    1. pip install -e integrations/fastembed
    2. Run python examples/fastembed_colbert_reranker.py
      • First run downloads the ColBERT model via FastEmbed (cached afterward).
      • Verified that documents mentioning “late interaction/ColBERT” rank at the top.
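
For illustration, a rough sketch of the pure-math and validation checks. It reuses the maxsim_score function sketched under "Proposed Changes" above; the exception type for invalid parameters is an assumption, and the actual tests may differ:

  import numpy as np
  import pytest

  def test_maxsim_hand_computed():
      # Two query tokens, two doc tokens, dot-product similarity.
      q = np.array([[1.0, 0.0], [0.0, 1.0]])
      d = np.array([[0.6, 0.8], [1.0, 0.0]])
      # Expected MaxSim: query token 0 -> best doc token 1 (1.0); query token 1 -> best doc token 0 (0.8).
      expected = 1.0 + 0.8
      # maxsim_score is the NumPy sketch from the "Proposed Changes" section above.
      assert np.isclose(maxsim_score(q, d, similarity="dot"), expected)

  def test_invalid_similarity_raises():
      from haystack_integrations.components.rankers.fastembed import FastembedColbertReranker
      # Only "cosine" and "dot" are accepted; assuming validation raises ValueError.
      with pytest.raises(ValueError):
          FastembedColbertReranker(similarity="euclidean")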

Notes for the reviewer

  • Lives entirely under integrations/fastembed/src/... (no changes to core Haystack).
  • Warm-up handles minor API differences in FastEmbed by retrying without unknown kwargs.
  • Intended usage: rerank ~100–500 candidates from a first-stage retriever (BM25 or dense).
  • Example import path:
    from haystack_integrations.components.rankers.fastembed import FastembedColbertReranker
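
A minimal usage sketch follows. The model name and the constructor parameters shown here are assumptions based on the description above (FastEmbed lists "colbert-ir/colbertv2.0" among its late-interaction models); check the component's docstring for the actual defaults:

  from haystack import Document
  from haystack_integrations.components.rankers.fastembed import FastembedColbertReranker

  # The `model` and `top_k` parameter names are assumed from the description above.
  reranker = FastembedColbertReranker(model="colbert-ir/colbertv2.0", top_k=3)
  reranker.warm_up()

  docs = [
      Document(content="ColBERT scores queries and documents with token-level late interaction."),
      Document(content="BM25 is a classic sparse retrieval function."),
      Document(content="Cross-encoders jointly encode the query and the document."),
  ]
  result = reranker.run(query="What is late interaction in ColBERT?", documents=docs)
  for doc in result["documents"]:
      print(doc.score, doc.content)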
    
