Why are the topic keys parsed into integers?

Open ChuanMeng opened this issue 2 years ago • 5 comments

Hi, why are the topic keys parsed into integers? I am running BM25 on the TREC CAsT datasets, where a topic key looks like "30_1", and this parsing step turns that key into "301". The modified topic keys then no longer match the ones in the qrels file, which I want to avoid. How can I disable this behavior?

https://github.com/castorini/pyserini/blob/4d61b56319833e9111801085e0b4e39f87d31bb7/pyserini/search/_base.py#L461
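
To illustrate the problem, here is a minimal sketch. It assumes the linked line converts each key with Python's built-in `int()`, which since Python 3.6 accepts underscores as digit separators. Both `lenient_key` and `strict_key` are hypothetical helpers written for this example and are not part of pyserini.

```python
def lenient_key(key):
    """Mimic the assumed behavior: try int(), fall back to the raw string."""
    try:
        # int() accepts "30_1" because "_" is a valid digit separator.
        return int(key)
    except ValueError:
        return key


def strict_key(key):
    """Hypothetical alternative: convert only keys made up solely of ASCII digits."""
    return int(key) if key.isascii() and key.isdigit() else key


print(lenient_key("30_1"))  # 301
print(strict_key("30_1"))   # 30_1
print(strict_key("301"))    # 301
```

With a stricter check like `strict_key`, a key such as "30_1" would stay a string and still match the qrels file, while purely numeric keys would still become integers.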

I look forward to your reply and thank you in advance.

Cheers, Chuan

ChuanMeng, Nov 17 '22 13:11