Sean MacAvaney

229 comments by Sean MacAvaney

This affects cases in which the batch is larger than 1. With the default settings, this is the case for training (gradient accumulation batch size of 2) and evaluation (batch...

Hi @Pourbahman, I recommend using a package like [OpenNIR](https://github.com/Georgetown-IR-Lab/OpenNIR) or [Capreolus](https://github.com/capreolus-ir/capreolus). This repository was meant as a simplification/demonstration of the main idea, rather than a comprehensive system for...

I don't recall trying it, but in [PARADE](https://arxiv.org/pdf/2008.09093.pdf) we identified some weirdness about the document ranking task that may explain what you're seeing. The dataset has a strong bias towards...

Thanks for the feedback and interest in this work. I am familiar with both of the papers you cited. For WebTrack, we measure using ERR@20 and nDCG@20. For Robust04, we...

Information on obtaining the two ClueWeb collections is available here: - https://lemurproject.org/clueweb09.php/ - https://lemurproject.org/clueweb12/ They are purchased from CMU and sent on hard drives. Unfortunately, they cannot be distributed by...

We cannot release the dataset directly due to the data usage agreement. However, I could provide a script that builds the file from the [ir-datasets](https://github.com/allenai/ir_datasets/) package, if that would help?...

The error says that the index was created with a newer Lucene version than the current software supports. I think you should be able to add a codecs JAR to...

Hi Martin! Thanks for reporting. I'm looking into these issues (as well as the related #21).

Hi @krasserm -- sorry for the delays. I'm trying to balance a variety of priorities right now, and I have not had much time to dig into this.

Hi Marin, Thanks for pointing out this inconsistency! I suspect that it can be explained by a mismatch between the original code used for running the experiments (which reflect the...