Lily Ge comments

Results: 21 comments of Lily Ge

Implement -outputRerankerRequests for SearchHnswDenseVectors

I have a 'working' branch [here](https://github.com/lilyjge/anserini/tree/rerank), but there are some problems/limitations that make it not very useful:
1. The dense queries in tools only have the vector embeddings, not the query...

Look at rrf impl in Lucene

RRF implementation differences:
- Lucene uses TopDocs, which holds an array of ScoreDoc (each representing one hit), instead of ScoredDocs; every TopDocs is for one query
- Since every TopDocs...
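For context, here is a minimal sketch of reciprocal rank fusion for a single query, which is the unit a per-query result object represents. It is not the Anserini or the Lucene code: the function name `reciprocal_rank_fusion`, the `depth` parameter, and the plain-list input format are assumptions made for illustration, and `k=60` is only the commonly used default.

```python
from collections import defaultdict


def reciprocal_rank_fusion(runs, k=60, depth=1000):
    """Fuse several ranked lists of doc ids for ONE query (hypothetical helper).

    runs  -- list of ranked lists, each ordered best-first
    k     -- RRF smoothing constant (60 is a common default, assumed here)
    depth -- how many hits to read from each list (assumed parameter)
    """
    fused = defaultdict(float)
    for ranked_docids in runs:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, docid in enumerate(ranked_docids[:depth], start=1):
            fused[docid] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by doc id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    run_a = ["d1", "d2", "d3"]
    run_b = ["d2", "d3", "d4"]
    for docid, score in reciprocal_rank_fusion([run_a, run_b]):
        print(docid, round(score, 6))
```

Documents that appear in both lists accumulate two reciprocal-rank terms, so they rise above documents that appear in only one list.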

Look at rrf impl in Lucene

Is there a way to do so directly in Anserini? Or do I have to install Lucene itself?

Look at rrf impl in Lucene

Yep. To summarize, the current Anserini implementation is faster than Lucene's. #2782 can be closed as well.

Refactor fusion referencing Lucene implementation for speed

Yes, due to an issue building the commands in the script I used, the Lucene implementation wasn't actually running and the time was just for evaluating the existing runs. Here...

Try out d2lang to visualize the onboarding path

Yes, it is great! Syntax is simple and easy to use. Attached is a replica of [this diagram](https://github.com/castorini/pyserini/blob/master/docs/images/architecture-biencoder.png) that I was able to whip up really quickly. D2 supports icons...

Try out d2lang to visualize the onboarding path

For the earlier diagram:
```
Documents -> doc_encoded: Doc Encoder
doc_encoded: "[...]\n[...]\n[...]"
q_encoded: "[...]"
Query Encoder -> q_encoded: Query Encoder
q_encoded -> Top-k Retrieval Ranked List
```
Something like this...

MCP for rankllm

I can work on this!

MCP for rankllm

Since retrieval functionality already exists in MCPyserini, we have several options for how to do this:
- Copying over the existing MCPyserini controller logic + new RankLLM tools in...

bright integration

For caching to HF and splade, do we still want to use SPLADE++_EnsembleDistil_ONNX? Or update to SpladeV3?