
Missing entity pairs

Open hvthaibk opened this issue 1 year ago • 2 comments

Hello,

Thanks for sharing the code and datasets. They are very helpful. I've discovered that some pairs in the original ground truth are missing from your training, validation, and test pairs.

For example, neither 19156 nor 34323 appears in any of

  • ContEA/datasets/FR-EN/base/train_links
  • ContEA/datasets/FR-EN/base/valid_links
  • ContEA/datasets/FR-EN/base/test_links

However, they do appear in the ContEA/datasets/FR-EN/ent_dict file as

  • 19156: http://fr.dbpedia.org/resource/Université_Lille_I
  • 34323: http://dbpedia.org/resource/Lille_University_of_Science_and_Technology

And the pair ("http://fr.dbpedia.org/resource/Université_Lille_I", "http://dbpedia.org/resource/Lille_University_of_Science_and_Technology") does exist in the original ground-truth.
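A check along these lines can be sketched as follows. This is only a rough illustration: the two-column, tab-separated file layout and the helper names `read_pairs` and `missing_pairs` are my assumptions, not something taken from the repository.

```python
from pathlib import Path


def read_pairs(path):
    """Read one pair per line into a set of (left, right) tuples.

    Assumes two tab-separated columns per line (an assumption about the
    file layout); lines that do not match are skipped.
    """
    pairs = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        parts = line.strip().split("\t")
        if len(parts) == 2:
            pairs.add((parts[0], parts[1]))
    return pairs


def missing_pairs(ground_truth, splits):
    """Return the ground-truth pairs that appear in none of the splits."""
    covered = set().union(*splits)
    return ground_truth - covered


# Toy data in place of the real files; only the first pair comes from this issue.
ground_truth = {("19156", "34323"), ("1", "2"), ("3", "4")}
train, valid, test = {("1", "2")}, set(), {("3", "4")}
print(sorted(missing_pairs(ground_truth, [train, valid, test])))
```

With the real data, the three link files would be loaded with `read_pairs`, and the original ground truth (which is given as URIs) would first have to be mapped to IDs through the entity dictionary before taking the difference.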

There are 329 such pairs for the FR-EN dataset. Could you please double-check? Thanks!

hvthaibk, Apr 18 '23 12:04