cedr
Run files of Vanilla BERT checkpoints do not match test folds in data/robust
First of all, thanks a lot for your interesting work on CEDR and for the code in this repository.
I downloaded the Vanilla BERT and CEDR-KNRM checkpoints from #18 and checked the query ids in the .run files contained in the downloaded archive. While the sets of query ids in cedrknrm-robust-f[1-5].run match those in data/robust/f[1-5].test.run, the sets of query ids in vbert-robust-f[1-5].run do not match those in data/robust/f[1-5].test.run (e.g. the set of query ids in vbert-robust-f1.run is different from the set of query ids in data/robust/f1.test.run, and also cedrknrm-robust-f1.run).
Why are the folds for Vanilla BERT and CEDR-KNRM different? On which folds have the Vanilla BERT checkpoints been trained/validated? Given that the test folds of the Vanilla BERT and CEDR-KNRM checkpoints are different I assume that the provided Vanilla BERT checkpoints have not been used as initial weights for obtaining the provided CEDR-KNRM checkpoints. Is this assumption correct? If yes, which Vanilla BERT checkpoints have been used to initialize CEDR-KNRM training? Do you mind sharing these checkpoints too?
I'm currently investigating issues with reproducing the results published in the paper. More on that in a separate ticket ...
To be more precise regarding
> e.g. the set of query ids in vbert-robust-f1.run is different from the set of query ids in data/robust/f1.test.run, and also cedrknrm-robust-f1.run
the number of common query ids in vbert-robust-f1.run and data/robust/f{x}.test.run for x = 1..5 is:
x = 1: 7
x = 2: 13
x = 3: 12
x = 4: 7
x = 5: 11
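For anyone who wants to repeat this comparison, here is a minimal sketch of how such overlap counts could be computed. It assumes the run files are whitespace-separated TREC-style files with the query id in the first column; the helper names `read_query_ids` and `fold_overlaps` are hypothetical and not part of the cedr repository.

```python
from pathlib import Path


def read_query_ids(run_path):
    """Return the set of query ids found in a run file.

    Assumption: each non-blank line is whitespace-separated and its
    first field is the query id (TREC-style run format).
    """
    qids = set()
    with open(run_path, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            if fields:  # skip blank lines
                qids.add(fields[0])
    return qids


def fold_overlaps(run_path, fold_dir, folds=range(1, 6)):
    """Count the query ids shared between run_path and each
    f{x}.test.run file in fold_dir, keyed by fold number x."""
    qids = read_query_ids(run_path)
    return {
        x: len(qids & read_query_ids(Path(fold_dir) / f"f{x}.test.run"))
        for x in folds
    }
```

Calling `fold_overlaps("vbert-robust-f1.run", "data/robust")` would return a dict mapping each fold number to the number of shared query ids. If a checkpoint's run file matched one test fold, that fold's count would equal the size of the run file's query id set and the counts for the other folds would be 0.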
Hi Marin,
Thanks for pointing out this inconsistency! I suspect that it can be explained by a mismatch between the original code used for running the experiments (which the vbert-robust-f1.run files reflect) and the simplified example we released here. Specifically, I think there may have been a problem with the code that exported the data/robust/f{x}.test.run files from the original source. But I'll need to spend some time digging into exactly what happened.
- sean