
Add hyperparameters (k1, b) support to BM25 implementation

Open whybe-choi opened this issue 2 months ago • 4 comments

The current BM25 implementation in mteb/models/model_implementations/bm25.py doesn't support tuning the k1 and b hyperparameters. It seems that only the default settings are used, at line 65.

https://github.com/embeddings-benchmark/mteb/blob/16ae6ff9cdc44cb1e2ce9dfa73155ec50cf77dd3/mteb/models/model_implementations/bm25.py#L45-L67
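For reference, here is a minimal sketch of where k1 and b enter the BM25 score. It is a pure-Python illustration, not the code in bm25.py: the `bm25_scores` helper, the toy corpus, the Lucene-style IDF, and the defaults k1=1.5 and b=0.75 are all assumptions made for the example.

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score each tokenized document in `corpus` against a tokenized `query`.

    Hypothetical helper for illustration only; not mteb's implementation.
    """
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in corpus for term in set(doc))

    scores = []
    for doc in corpus:
        term_freq = Counter(doc)
        # b controls how strongly documents longer than average are penalized.
        length_norm = 1 - b + b * len(doc) / avg_len
        score = 0.0
        for term in query:
            tf = term_freq[term]
            if tf == 0:
                continue
            df = doc_freq[term]
            # Lucene-style IDF variant (assumed here), always non-negative.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # k1 controls how quickly repeated occurrences of a term saturate.
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


corpus = [
    "bm25 ranks documents by term frequency".split(),
    "bm25 bm25 bm25 repeated many times in a much longer document about ranking".split(),
    "dense embeddings are an alternative".split(),
]
query = ["bm25"]

# Compare a few (k1, b) settings on the same query.
for k1, b in [(1.5, 0.75), (0.0, 0.75), (1.5, 0.0)]:
    print(k1, b, [round(s, 3) for s in bm25_scores(query, corpus, k1=k1, b=b)])
```

With k1=0 the term frequency no longer matters, and with b=0 document length is ignored, so the ranking of the first two documents can change with the setting. That is why tuned and default runs would not be directly comparable.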

I think it would be better to support searching with different hyperparameter settings as well. May I try working on this?

whybe-choi avatar Oct 24 '25 09:10 whybe-choi

I think we first need to decide what to do with experiments (https://github.com/embeddings-benchmark/mteb/issues/1211). I will try to work on this next week. You can pass these arguments locally, but for now I don't think we should support this, for the sake of reproducibility.

Samoed avatar Oct 24 '25 09:10 Samoed

Yeah, so do we want these differently tuned versions on the leaderboard? They are perfectly fine to explore locally (the leaderboard can't have everything).

KennethEnevoldsen avatar Oct 27 '25 10:10 KennethEnevoldsen

I'm not sure about the leaderboard either, but we should save information about them using a different folder structure. This is also part of https://github.com/embeddings-benchmark/mteb/issues/3369
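As a rough sketch of what a hyperparameter-aware folder structure could look like: the `results_dir` helper, the `results` root, the revision string, and the `key=value` naming are all hypothetical and do not describe mteb's current layout.

```python
from pathlib import Path


def results_dir(root, model_name, revision, **hyperparams):
    """Build a results path with one sub-folder per hyperparameter setting.

    Hypothetical helper for illustration only; not part of mteb.
    """
    base = Path(root) / model_name.replace("/", "__") / revision
    if not hyperparams:
        # Default settings keep the plain layout, so existing results stay valid.
        return base
    # Sort keys so the same settings always map to the same folder name.
    tag = "_".join(f"{key}={hyperparams[key]}" for key in sorted(hyperparams))
    return base / tag


print(results_dir("results", "bm25s", "some_revision").as_posix())
print(results_dir("results", "bm25s", "some_revision", k1=1.2, b=0.5).as_posix())
```

Keeping the default run at the existing path and putting tuned runs in sub-folders would let the leaderboard keep reading only the default results.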

Samoed avatar Oct 27 '25 10:10 Samoed

Yeah. I would simply say that this closes this issue (#3369 is still a concern)

KennethEnevoldsen avatar Oct 27 '25 10:10 KennethEnevoldsen