Lily Ge
I have a 'working' branch [here](https://github.com/lilyjge/anserini/tree/rerank), but there are some problems/limitations that make it not very useful:
1. The dense queries in tools only have the vector embeddings, not the query...
RRF implementation differences:
- Lucene uses TopDocs, which holds an array of ScoreDoc (each representing one hit) instead of ScoredDocs; every TopDocs is for one query
- Since every TopDocs...
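
For reference, here is a minimal sketch of reciprocal rank fusion over per-query ranked lists, which is the "one list per query" shape described above. It is not the Anserini or Lucene code: the function names `rrf_fuse` and `fuse_all`, the `depth` cutoff, and the constant `k=60` (a commonly used default) are assumptions for illustration only.

```python
from collections import defaultdict


def rrf_fuse(runs, k=60, depth=1000):
    """Fuse the ranked lists of ONE query with reciprocal rank fusion.

    runs:  list of rankings, each a list of doc ids ordered best first.
    k:     smoothing constant (60 is assumed here, not taken from the thread).
    depth: number of fused hits to keep (hypothetical parameter).
    """
    scores = defaultdict(float)
    for ranking in runs:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, docid in enumerate(ranking, start=1):
            scores[docid] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties on doc id for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:depth]


def fuse_all(runs_by_query, k=60, depth=1000):
    """Apply rrf_fuse independently to every query.

    runs_by_query: {query_id: [ranking_from_run_1, ranking_from_run_2, ...]}
    """
    return {qid: rrf_fuse(runs, k, depth) for qid, runs in runs_by_query.items()}
```

Because each query is fused on its own, a per-query container (one list of hits per query) and a single container holding all queries differ only in how the input is grouped before calling `rrf_fuse`; the scoring itself is identical.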
Is there a way to do so directly in Anserini? Or do I have to install Lucene itself?
Yep. To summarize, current Anserini impl is faster than Lucene. #2782 can be closed as well.
Yes, due to an issue building the commands in the script I used, the Lucene implementation wasn't actually running and the time was just for evaluating the existing runs. Here...
Yes, it is great! Syntax is simple and easy to use. Attached is a replica of [this diagram](https://github.com/castorini/pyserini/blob/master/docs/images/architecture-biencoder.png) I was able to whip up really quickly. D2 supports icons...
For the earlier diagram:
```
Documents -> doc_encoded: Doc Encoder
doc_encoded: "[...]\n[...]\n[...]"
q_encoded: "[...]"
Query Encoder -> q_encoded: Query Encoder
q_encoded -> Top-k Retrieval Ranked List
```
Something like this...
I can work on this!
Since retrieval functionality already exists in MCPyserini, we have several options for how to do this:
- Copying over the existing MCPyserini controller logic + new RankLLM tools in...
For caching to HF and SPLADE, do we still want to use SPLADE++_EnsembleDistil_ONNX? Or update to SpladeV3?