Hippolyte Gisserot-Boukhlef
Hello Imene, and thanks Manu for the explanations! Here are some further clarifications regarding your questions. 1. Is the main idea about discarding uncertain documents in a search/IR pipeline? Not...
Hi @orionw and thanks for your comment! I will try to clarify things a bit. The question we are trying to address with abstention is not to evaluate confidence at...
Hey @orionw, I suggest we look at a concrete example! Let’s assume we have a query Q and want to retrieve the top-5 documents using a retrieval system R. We...
Hi @orionw, regarding your follow-up questions: _-> So in essence, this is a metric computing the area under the nDCG@k score curve at various confidence scores -- is that...
Hi @orionw, from what I understand, you are arguing that, to measure calibration, we should use a fixed range of thresholds that does not vary with the domain. By...
Hi @KennethEnevoldsen and many thanks for your remarks! We have incorporated abstention as an evaluation metric rather than as a task in this new PR: [https://github.com/embeddings-benchmark/mteb/pull/841](https://github.com/embeddings-benchmark/mteb/pull/841). For the moment, we...