ir_datasets TREC 2024 Tip-of-the-Tongue

Dataset Information:

The training and dev data of the TREC 2023 Tip-of-the-Tongue track are now available: https://trec-tot.github.io/guidelines

Description from the website:

Tip of the tongue: The phenomenon of failing to recall something from memory, combined with partial recall and the feeling that recall is imminent.

Links to Resources:

Test queries: https://zenodo.org/records/13370657/files/test-2024.zip?download=1
Corpus: https://zenodo.org/records/11185090/files/corpus.jsonl.zip?download=1
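The corpus link above points at a zipped JSONL file. As a minimal sketch, assuming the archive holds a single `.jsonl` member with one JSON object per line, the records could be inspected with the standard library alone before writing the dataset definition. The helper name `iter_jsonl_from_zip` is hypothetical, and nothing here assumes particular field names in the records.

```python
import io
import json
import zipfile


def iter_jsonl_from_zip(zip_source, member=None):
    """Yield one parsed JSON object per non-empty line of a JSONL file inside a zip.

    zip_source may be a path or a binary file-like object.
    member is the name of the file inside the archive; if omitted, the first
    entry ending in .jsonl is used (an assumption about the archive layout).
    """
    with zipfile.ZipFile(zip_source) as archive:
        if member is None:
            candidates = [n for n in archive.namelist() if n.endswith(".jsonl")]
            if not candidates:
                raise ValueError("no .jsonl member found in archive")
            member = candidates[0]
        with archive.open(member) as raw:
            # Decode lazily so a large corpus is never loaded into memory at once.
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                line = line.strip()
                if line:
                    yield json.loads(line)
```

For example, `next(iter_jsonl_from_zip("corpus.jsonl.zip"))` would return the first record of a locally downloaded copy, which is enough to see which fields a document type would need.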

Dataset ID(s) & supported entities:

tip-of-the-tongue/2024: corpus
tip-of-the-tongue/2024/test: test queries

Checklist

Mark each task once completed. All should be checked prior to merging a new dataset.

- [ ] Dataset definition (in ir_datasets/datasets/[topid].py)
- [ ] Tests (in tests/integration/[topid].py)
- [ ] Metadata generated (using ir_datasets generate_metadata command, should appear in ir_datasets/etc/metadata.json)
- [ ] Documentation (in ir_datasets/etc/[topid].yaml)
  - [ ] Documentation generated in https://github.com/seanmacavaney/ir-datasets.com/
- [ ] Downloadable content (in ir_datasets/etc/downloads.json)
  - [ ] Download verification action (in .github/workflows/verify_downloads.yml). Only one needed per topid.
  - [ ] Any small public files from NIST (or other potentially troublesome files) mirrored in https://github.com/seanmacavaney/irds-mirror/. Mirrored status properly reflected in downloads.json.

Additional comments/concerns/ideas/etc.

Sep 16 '24 00:09 mam10eks

I think this should be rather fast, since it should be easy to integrate this into the code from the previous year: https://github.com/allenai/ir_datasets/blob/master/ir_datasets/datasets/trec_tot.py

Sep 16 '24 00:09 mam10eks

I will try to make a pull request :)

Sep 16 '24 00:09 mam10eks

I have created a pull request with some tests here: https://github.com/allenai/ir_datasets/pull/272

As soon as this is merged, we could close the issue :)

Sep 22 '24 05:09 mam10eks

Fixed with #272, sorry for the delay!

May 09 '25 14:05 seanmacavaney
