Results 20 comments of Keshav Santhanam

Could you re-run with the environment variable `COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True` set? It's possible that you need to erase your Torch extensions cache so the Torch extension code recompiles.

Can you try removing this folder and running again? `/home/zzh/.cache/torch_extensions/py38_cu113`
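The two suggestions above can be combined in a small script. This is a hedged sketch, not part of ColBERT itself: the environment variable name comes from the comment above, and the cache folder name (`py38_cu113` here) depends on your Python and CUDA versions, so adjust it to match your machine.

```python
import os
import pathlib
import shutil

# Enable verbose output when ColBERT loads/compiles its Torch extension
# (variable name taken from the maintainer's suggestion above).
os.environ["COLBERT_LOAD_TORCH_EXTENSION_VERBOSE"] = "True"

# Remove a stale torch_extensions cache so the extension recompiles.
# The exact subfolder name depends on your Python/CUDA versions.
cache_dir = pathlib.Path.home() / ".cache" / "torch_extensions" / "py38_cu113"
if cache_dir.exists():
    shutil.rmtree(cache_dir)
```

Run your ColBERT script afterwards in the same environment so the variable is visible to it.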

The `search` function in ColBERT accepts a `pids` argument which can be used to rank only the given documents.
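To illustrate what restricting the search to a `pids` list does, here is a plain-Python sketch (not the ColBERT implementation): when a candidate set is supplied, only those passage IDs are scored and ranked.

```python
def search(query_scores, pids=None, k=3):
    """Toy ranker: query_scores maps pid -> relevance score.

    If pids is given, only those candidates are considered,
    mirroring the effect of Searcher.search(..., pids=...).
    """
    candidates = query_scores if pids is None else {p: query_scores[p] for p in pids}
    return sorted(candidates, key=candidates.get, reverse=True)[:k]

scores = {0: 0.2, 1: 0.9, 2: 0.5, 3: 0.7}
print(search(scores, pids=[0, 2, 3], k=2))  # → [3, 2]
```

Note that pid 1, the highest-scoring passage overall, is excluded because it is not in the candidate list.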

Apologies for the delay in getting back to you - the existing metadata filtering feature is given by the `filter_fn` parameter in `Searcher.search` and `Searcher.search_all`. This user-defined `filter_fn` takes as...

The only way to do this in ColBERT would be to pre-filter (https://www.pinecone.io/learn/vector-search-filtering/) the passages that meet the metadata-based filter and then treat those as the candidate passages. There's currently...
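The pre-filtering idea can be sketched in a few lines of plain Python. The passage records, metadata fields, and `prefilter` helper below are all hypothetical; the point is only that a metadata predicate runs first, and the surviving pids become the candidate set handed to the reranker.

```python
# Hypothetical passage store with per-passage metadata.
passages = [
    {"pid": 0, "year": 2019, "text": "..."},
    {"pid": 1, "year": 2022, "text": "..."},
    {"pid": 2, "year": 2023, "text": "..."},
]

def prefilter(passages, predicate):
    """Return the pids of passages satisfying the metadata predicate."""
    return [p["pid"] for p in passages if predicate(p)]

# Keep only recent passages, then rerank just those candidates.
candidate_pids = prefilter(passages, lambda p: p["year"] >= 2022)
print(candidate_pids)  # → [1, 2]
```

These `candidate_pids` would then be passed to the search call (e.g. via the `pids` argument mentioned earlier in this thread) so only the filtered documents are ranked.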

The method I proposed would do a brute-force kNN search on the passed-in candidate passages, though if implemented correctly this method would still benefit from the relevance score approximation optimizations...

@VThejas want to try this out? @okhat correctly pointed out that this should reduce latency for reranking

I think this was auto-requested? Not sure how the new CI works yet, but no explicit need for mamba review here.

> LGTM but are there any perf implications of doing the de-tokenization on the engine side rather than the client side? @kanz-nv Will this overhead go away with async scheduling...