pyserini
Why are topic keys parsed into integers?
Hi, I have a question: why are the topic keys parsed into integers? I am running BM25 on the TREC CAsT datasets, where a topic key looks like "30_1", and this parsing step changes the key to "301". The modified topic keys then no longer match the ones in the qrels file, which I want to avoid. How can I disable this step?
https://github.com/castorini/pyserini/blob/4d61b56319833e9111801085e0b4e39f87d31bb7/pyserini/search/_base.py#L461
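For illustration, here is a minimal sketch of what I think is happening. It assumes the conversion at the linked line is a plain `int()` call, which I have not confirmed. Python's `int()` accepts underscores between digits, so "30_1" parses as 301 instead of raising an error. The helper `to_topic_key` is hypothetical and not part of Pyserini; it only shows the behaviour I would like, where a key is converted only if it consists entirely of digits.

```python
import re

# Python's int() treats an underscore between digits as a separator,
# so a CAsT-style key is silently collapsed into a single number.
collapsed = int("30_1")
print(collapsed)


def to_topic_key(key: str):
    """Hypothetical helper (not Pyserini's API): convert a key to int
    only when it is made up purely of ASCII digits, else keep the string."""
    if re.fullmatch(r"[0-9]+", key):
        return int(key)
    return key


# Made-up topics for demonstration only.
raw_topics = {"30_1": {"title": "first turn"}, "31": {"title": "plain id"}}
topics = {to_topic_key(k): v for k, v in raw_topics.items()}
print(list(topics.keys()))
```

With a check like this, "30_1" would stay a string and still match the qrels file, while purely numeric keys such as "31" would still become integers.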
I look forward to your reply and thank you in advance.
Cheers, Chuan