ir_datasets issues

File structure stated in msmarco_passage.py is not aligned with downloaded top1000.dev.tar.gz

1

**Describe the bug**

In `msmarco_passage.py`, lines 199-204, the `dev/small` dataset is defined as:

```
subsets['dev/small'] = Dataset(
    collection,
    TsvQueries(Cache(TarExtract(dlc['collectionandqueries'], 'queries.dev.small.tsv'), base_path/'dev/small/queries.tsv'), namespace='msmarco', lang='en'),
    TrecQrels(Cache(TarExtract(dlc['collectionandqueries'], 'qrels.dev.small.tsv'), base_path/'dev/small/qrels'), QRELS_DEFS),
    TrecScoredDocs(Cache(ExtractQidPid(TarExtract(dlc['dev/scoreddocs'], 'top1000.dev')), base_path/'dev/ms.run')),
)
```
...
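One way to check a report like this is to list the members of the downloaded archive and compare them with the name the loader expects. The following is only a minimal sketch using the Python standard library: the helper names are hypothetical, the archive is built in memory as a stand-in for the real download, and the member name `top1000.dev.tsv` is an invented example, not the actual content of `top1000.dev.tar.gz`.

```python
import io
import tarfile


def list_tar_members(fileobj):
    """Return the member names of a gzip-compressed tar archive."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        return tar.getnames()


def build_example_archive(member_name):
    """Build a tiny in-memory .tar.gz with one member (stand-in for a real download)."""
    payload = b"1\t100\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


# Name the loader asks TarExtract for, taken from the snippet above.
expected = "top1000.dev"

# Hypothetical member name, used only to illustrate a mismatch.
archive = build_example_archive("top1000.dev.tsv")
names = list_tar_members(archive)

print("members:", names)
print("expected member present:", expected in names)
```

For a real check, the same `list_tar_members` call could be pointed at an opened file handle of the downloaded archive instead of the in-memory stand-in.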

yuenherny

bug

neuclir23

1

seanmacavaney

TREC CaST

21

This pull request is for TREC CaST 2019 to 2022.

## Generic classes

For the moment, it contains generic handler classes that might be moved to other places (and need...

bpiwowar

Clueweb22

24

I'd like to keep this PR as a way of tracking progress of the ir_datasets integration for ClueWeb22. Of course, the implementation is far from finished (as you can see...

janheinrichmerker

Unified getter for the relevance level

1

**Is your feature request related to a problem? Please describe.** ir_datasets centralizes a lot of information about datasets. However, when using evaluation measures with binary levels (like MAP, MRR, ...),...
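To illustrate the problem being described, here is a minimal sketch of how graded judgments might be turned into binary labels for measures such as MAP or MRR. It assumes a per-dataset threshold, here called `min_rel`, which is a hypothetical parameter and not an existing ir_datasets getter. The `Qrel` tuple only mirrors the usual `query_id`, `doc_id`, `relevance` shape of qrels records.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Qrel(NamedTuple):
    """Illustrative qrels record with a graded relevance label."""
    query_id: str
    doc_id: str
    relevance: int


def binarize(qrels: Iterable[Qrel], min_rel: int) -> Dict[Tuple[str, str], int]:
    """Map each (query_id, doc_id) pair to 1 if relevance >= min_rel, else 0.

    `min_rel` is a hypothetical per-dataset threshold supplied by the caller.
    """
    return {(q.query_id, q.doc_id): int(q.relevance >= min_rel) for q in qrels}


# Invented example judgments on a 0-3 graded scale.
graded = [
    Qrel("q1", "d1", 0),
    Qrel("q1", "d2", 1),
    Qrel("q1", "d3", 3),
]

# The same judgments yield different binary labels depending on the threshold,
# which is why the threshold needs to be known per dataset.
print(binarize(graded, min_rel=1))
print(binarize(graded, min_rel=2))
```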

TheMrSheldon

enhancement

BioASQ

Following this [issue](https://github.com/allenai/ir_datasets/issues/250), I'm submitting this PR to add BioASQ to ir_datasets.

MathVast

Cannot read LoTTE docs

**Describe the bug**

There seems to be an issue when downloading/reading the LoTTE datasets.

**Affected dataset(s)**

LoTTE

**To Reproduce**

Run in Python:

```
import ir_datasets
dataset = ir_datasets.load("lotte/recreation/test")
for doc...
```

ftvalentini

bug

Add BioASQ dataset to the list of supported BEIR datasets

2

Hi @seanmacavaney, I would like to use the BioASQ dataset for an experiment, and I stumbled across this on the GitHub repo of the BEIR paper [beir-cellar](https://github.com/beir-cellar/beir/issues/86#issuecomment-1548959460) where the...

MathVast

add-dataset

Add test dataset for trec tip of the tongue dataset

1

The test queries for the TREC ToT task are now available, and this pull request adds them in a similar way to how they are added for dev and train...

mam10eks

LongEval Retrieval (used at CLEF 2023)

6

**Dataset Information:** The goal would be to integrate the LongEval data for Task 1 (retrieval). The information from the [official task description](https://clef-longeval.github.io/tasks/): ``` The goal of Task...

mam10eks

add-dataset
