Add KORE-50 DYWC dataset

Open RicardoUsbeck opened this issue 5 years ago • 2 comments

https://www.aclweb.org/anthology/2020.lrec-1.291.pdf

RicardoUsbeck • May 31 '20 13:05

The paper's authors took the KORE-50 dataset annotated with DBpedia and built three new datasets by annotating the sentences with Wikidata, YAGO, and Crunchbase. The KORE-50 DBpedia dataset is already in GERBIL.

Since the KORE-50 Wikidata and YAGO datasets contain additional entities, I could add these to the existing KORE-50, but that would make earlier experiments hard to repeat. The other option would be to merge them all into one dataset and add it as a new dataset, KORE-50 DYWC. Then we would have both KORE-50 and KORE-50 DYWC in GERBIL, although they would of course overlap (see Table 3 and Table 5 in the paper).
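To illustrate the second option, here is a rough sketch of what "add them all into one dataset" could look like. It is not GERBIL code: the function `merge_annotations`, the dictionary layout, and the `example.org` URIs are all made up for illustration. The point is only that the merged dataset is built as a new object, so the existing KORE-50 stays exactly as it was.

```python
from collections import defaultdict


def merge_annotations(*datasets):
    """Union the per-sentence annotations of several datasets.

    Each dataset is assumed (hypothetically) to map a sentence ID to a
    set of (start, end, entity_uri) tuples. The inputs are not modified.
    """
    merged = defaultdict(set)
    for dataset in datasets:
        for sentence_id, annotations in dataset.items():
            merged[sentence_id] |= set(annotations)
    return dict(merged)


# Toy data with placeholder URIs, not taken from the real datasets.
kore50_dbpedia = {
    "s1": {(0, 5, "http://example.org/dbpedia/A")},
}
kore50_wikidata = {
    "s1": {
        (0, 5, "http://example.org/wikidata/A"),
        (10, 16, "http://example.org/wikidata/B"),  # entity missing in the DBpedia version
    },
}

# The combined dataset would be registered under a new name,
# leaving the original KORE-50 untouched.
kore50_dywc = merge_annotations(kore50_dbpedia, kore50_wikidata)
```

With this layout, `kore50_dywc["s1"]` holds all three annotations, while `kore50_dbpedia` still holds only its original one, which is what keeps old KORE-50 results repeatable.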

There was a similar discussion here (#170) regarding other datasets.

LukasBluebaum • Sep 11 '20 13:09

Hi, thanks for looking into the issue! I would like to have KORE-50 and KORE-50 DYWC in GERBIL side-by-side :)

RicardoUsbeck • Sep 11 '20 18:09