Just writing one line and pressing CTRL+C (without Enter) yields the following for me. I think there is some batching issue.

```python
During handling of the above exception, another exception occurred:...
```
Opened a PR with some changes: https://github.com/younesbelkada/bigscience/pull/1/
> Hey man, great work. There are a total of `98 file changes` in this pull request. I am trying to follow this pull request, but it's a big one actually. Can...
This is pretty important, as it would mean the BM25 + Reranking results overstate that method's actual performance.
> Have a look at this code example:
>
> https://github.com/beir-cellar/beir/blob/c3334fd5b336dba03c5e3e605a82fcfb1bdf667d/examples/retrieval/evaluation/reranking/evaluate_bm25_ce_reranking.py#L63
>
> It doesn't use the [EvaluateRetrieval.rerank](https://github.com/beir-cellar/beir/blob/c3334fd5b336dba03c5e3e605a82fcfb1bdf667d/beir/retrieval/evaluation.py#L25) method, instead it uses the [Rerank.rerank](https://github.com/beir-cellar/beir/blob/c3334fd5b336dba03c5e3e605a82fcfb1bdf667d/beir/reranking/rerank.py#L14) method.

Ah great, thanks! 👍
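For reference, a minimal sketch of the two-stage pipeline the linked example implements (the dataset path, index name, and Elasticsearch host are assumptions; the BM25 stage needs a running Elasticsearch instance):

```python
# Sketch of BM25 retrieval followed by cross-encoder reranking via
# Rerank.rerank, mirroring the linked BEIR example.
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.lexical import BM25Search
from beir.reranking import Rerank
from beir.reranking.models import CrossEncoder

# Assumed path to an unzipped BEIR dataset
corpus, queries, qrels = GenericDataLoader("datasets/scifact").load(split="test")

# First stage: lexical BM25 retrieval (Elasticsearch on localhost assumed)
bm25 = EvaluateRetrieval(BM25Search(index_name="scifact", hostname="localhost"))
results = bm25.retrieve(corpus, queries)

# Second stage: re-score the top-100 BM25 candidates per query
# with a cross-encoder, using Rerank.rerank rather than EvaluateRetrieval.rerank
reranker = Rerank(CrossEncoder("cross-encoder/ms-marco-electra-base"), batch_size=128)
rerank_results = reranker.rerank(corpus, queries, results, top_k=100)
```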
> Yes, I did confirm their equivalence a while ago **for some tasks only**. It could be the difference in the number of final results we return:
>
> *...
Yeah, I was leaving `corpus_size` as the default. The code is [here](https://github.com/Muennighoff/mtebscripts/blob/d0dc59d8f6ee3cb64e5b81ad8489b8716dc1dbb6/run_array.py#L152). Thanks for investigating it!
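If the gap really comes down to how many final results each path returns, one hypothetical sanity check is to truncate both result sets to the same top-k before scoring (all names below are illustrative, not from either codebase):

```python
# Hypothetical check: keep only the top_k highest-scored docs per query,
# so any remaining metric difference can't come from differing result counts.
def trim_results(results, top_k=100):
    """results: query_id -> {doc_id: score}; returns the same mapping trimmed to top_k."""
    return {
        qid: dict(sorted(docs.items(), key=lambda kv: kv[1], reverse=True)[:top_k])
        for qid, docs in results.items()
    }
```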
> I'll try to debug this weird behaviour and make a more stable DRPES. But for now, please refrain from using it for your experiments, @Muennighoff. Thank you for raising...
The dataset download is currently failing, though it works for me on Colab. Probably adding `ignore_verifications=True` to the `load_dataset` call used in the test would make it work.
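A minimal sketch of that fix, with a placeholder dataset id standing in for whatever the test actually loads:

```python
# ignore_verifications=True tells datasets to skip checksum/split-size
# verification, which is what typically fails when the upstream download changes.
from datasets import load_dataset

dataset = load_dataset(
    "some/dataset-used-in-the-test",  # placeholder; substitute the test's real dataset id
    ignore_verifications=True,
)
```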
I think we want to do 2 datasets:
1) English prompts only
2) As many multilingual prompts as possible

So I think it's fine to merge this & we'll just...