Jimmy Lin
Sure! Issue noted and PR welcome, but this is lowish on our priority list to fix...
Can you isolate the troublesome record?
So, if the error comes from a 3rd party lib, we should just eat the exception and move on? Can we build this into a test case?
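A minimal sketch of the "eat the exception and move on" idea, assuming the third-party call can be wrapped as a plain function. `TolerantProcessor`, `processAll`, and `skipped` are hypothetical names for illustration, not anything in the codebase:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: runs a parser over every record and skips the ones
// that make the parser throw, instead of aborting the whole run.
class TolerantProcessor {
  // Number of records skipped because the parser threw.
  int skipped = 0;

  <T> List<T> processAll(List<String> records, Function<String, T> parser) {
    List<T> results = new ArrayList<>();
    for (String record : records) {
      try {
        results.add(parser.apply(record));
      } catch (RuntimeException e) {
        // Swallow the failure, count it, and continue with the next record.
        skipped++;
      }
    }
    return results;
  }
}
```

A test case could then feed in one known-bad record and check that the good records still come through and that `skipped` is 1.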
> 1. How is p(M_d|q) computed? That is, how is a document's weight in the RM1 model defined when the initial retrieval is BM25? https://github.com/castorini/anserini/blob/master/src/main/java/io/anserini/rerank/lib/Rm3Reranker.java#L83 > 2. After receiving...
Sometimes this is a transient error... can you try again?
Ref #1038 - initial steps in that PR.
Hrm... that's pretty heavyweight and requires an external dependency. I suppose for catB everything can fit in memory. Perhaps we can assume the same for catA? 500m \* ( 2...
I'd rather index everything and use spam as a feature during retrieval. That way we don't need to develop a cutoff.
Just a big hashmap we load into memory at startup? Using fastutil, for example?
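Roughly what the in-memory lookup could look like. This is only a sketch: `SpamScores` is a hypothetical class, the input format (one `<score> <docid>` pair per line, whitespace-separated) is an assumption, and it uses `java.util.HashMap` as a stand-in. A primitive-valued fastutil map such as `Object2IntOpenHashMap` would avoid boxing and use less memory at this scale.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: loads per-document spam scores into memory once,
// so they can be looked up by docid during retrieval.
class SpamScores {
  private final Map<String, Integer> scores = new HashMap<>();

  // Assumed format: "<score> <docid>" per line. Blank or malformed lines are skipped.
  static SpamScores load(Reader in) {
    SpamScores loaded = new SpamScores();
    try (BufferedReader reader = new BufferedReader(in)) {
      String line;
      while ((line = reader.readLine()) != null) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 2) {
          continue;
        }
        try {
          loaded.scores.put(parts[1], Integer.parseInt(parts[0]));
        } catch (NumberFormatException e) {
          // Score is not an integer: skip this line.
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return loaded;
  }

  // Returns the score for docid, or defaultScore if the docid is unknown.
  int get(String docid, int defaultScore) {
    return scores.getOrDefault(docid, defaultScore);
  }

  int size() {
    return scores.size();
  }
}
```

At startup this would be called once with a reader over the scores file, and `get` would then be a constant-time lookup per document.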
How much memory do you have on your machine? The machine I use at UMD has 0.75 TB RAM :)