Keshav Santhanam
Could you re-run and set the environment variable `COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True`? It's possible that you need to erase your torch extensions cache to enable the Torch extension code to compile.
Can you try removing this folder and running again? `/home/zzh/.cache/torch_extensions/py38_cu113`
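For anyone hitting the same build issue, the two steps above (turn on verbose extension loading, then clear the cached build) can be sketched in Python. The helper name `clear_torch_extension_cache` is made up for illustration and is not part of ColBERT or PyTorch; the path in the commented-out call is the one from this thread and will differ on other machines.

```python
import os
import shutil
from pathlib import Path


def clear_torch_extension_cache(cache_dir: str) -> bool:
    """Delete a torch extensions build directory.

    Hypothetical helper (not a ColBERT or PyTorch API).
    Returns True if a directory was removed, False if there was nothing to remove.
    """
    path = Path(cache_dir).expanduser()
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True


# Ask ColBERT to print the extension build log on the next run.
os.environ["COLBERT_LOAD_TORCH_EXTENSION_VERBOSE"] = "True"

# Example call, commented out so nothing is deleted by accident:
# clear_torch_extension_cache("/home/zzh/.cache/torch_extensions/py38_cu113")
```

The environment variable has to be set before ColBERT loads its extensions, so set it at the top of the script or export it in the shell before launching.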
The `search` function in ColBERT accepts a `pids` argument which can be used to rank only the given documents.
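A rough sketch of what that call could look like. Everything beyond the `pids` argument itself is an assumption here: that `search` also takes `k`, that a plain list of passage ids is accepted, and that the result is a `(pids, ranks, scores)` triple. `rank_only` and `StubSearcher` are hypothetical names; the stub only exists so the snippet runs without a ColBERT index.

```python
from typing import Sequence


def rank_only(searcher, query: str, candidate_pids: Sequence[int], k: int = 10):
    """Rank only `candidate_pids` for `query`.

    Assumes `searcher.search(query, k=..., pids=...)` is available and
    returns (pids, ranks, scores); check this against your ColBERT version.
    """
    k = min(k, len(candidate_pids))  # cannot return more results than candidates
    return searcher.search(query, k=k, pids=list(candidate_pids))


class StubSearcher:
    """Hypothetical stand-in for a real Searcher; it does no actual scoring."""

    def search(self, query, k=10, pids=None):
        chosen = list(pids)[:k]
        return chosen, list(range(1, len(chosen) + 1)), [0.0] * len(chosen)


pids, ranks, scores = rank_only(StubSearcher(), "example query", [7, 3, 9], k=2)
```

With a real `Searcher`, the returned order would reflect the ColBERT scores rather than the input order.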
Apologies for the delay in getting back to you - the existing metadata filtering feature is given by the `filter_fn` parameter in `Searcher.search` and `Searcher.search_all`. This user-defined `filter_fn` takes as...
The only way to do this in ColBERT would be to [pre-filter](https://www.pinecone.io/learn/vector-search-filtering/) the passages which meet the metadata-based filter and then treat those as the candidate passages. There's currently...
The method I proposed would do a brute-force kNN search on the passed-in candidate passages, though if implemented correctly this method would still benefit from the relevance score approximation optimizations...
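To make the pre-filter plus brute-force idea concrete, here is a minimal, dependency-free sketch. It is not ColBERT code: the data layout (dicts keyed by passage id), the `keep` predicate, and all function names are assumptions for illustration, and it scores with exact MaxSim rather than the approximate relevance scoring mentioned above.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: for each query token take the best-matching
    document token, then sum over query tokens."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def prefilter_and_rank(
    query_vecs: Sequence[Vector],
    passages: Dict[int, Sequence[Vector]],  # pid -> token embeddings (assumed layout)
    metadata: Dict[int, dict],              # pid -> metadata record (assumed layout)
    keep: Callable[[dict], bool],           # user-supplied metadata predicate
    k: int = 10,
) -> List[Tuple[int, float]]:
    # Step 1: pre-filter, keeping only passages whose metadata passes the predicate.
    candidates = [pid for pid, meta in metadata.items() if keep(meta)]
    # Step 2: brute-force scoring of every remaining candidate.
    scored = [(pid, maxsim(query_vecs, passages[pid])) for pid in candidates]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy data: 2-dimensional "embeddings" for three passages.
passages = {0: [[1, 0], [0, 1]], 1: [[1, 0]], 2: [[0, 1]]}
metadata = {0: {"lang": "en"}, 1: {"lang": "de"}, 2: {"lang": "en"}}
top = prefilter_and_rank([[1, 0]], passages, metadata, lambda m: m["lang"] == "en")
```

The cost grows linearly with the number of passages that survive the filter, so this only makes sense when the filter is selective.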
@VThejas want to try this out? @okhat correctly pointed out that this should reduce latency for reranking
I think this was auto-requested? Not sure how the new CI works yet, but no explicit need for mamba review here.
> LGTM but are there any perf implications of doing the de-tokenization on the engine side rather than the client side? @kanz-nv Will this overhead go away with async scheduling...