Issues with the 1M datasets
I am very interested in your work, but I found that the number of entities does not match the figures reported in the paper. For example, for the EN-FR dataset the paper reports 1,877,793 entities for EN, but in the released dataset I count only 1,275,304. Is something wrong?
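For reference, a count like this can be reproduced with a short script. The following is only a minimal sketch: it assumes the triples are stored one per line as tab-separated `head`, `relation`, `tail` fields, and that every tail is an entity rather than a literal. The function name `count_entities` and the separator default are hypothetical, not taken from the repository.

```python
from typing import Iterable, Set


def count_entities(lines: Iterable[str], sep: str = "\t") -> int:
    """Count distinct entities that appear as head or tail of a triple.

    Assumes each line is "head<sep>relation<sep>tail" (an assumption about
    the file layout, not something confirmed by the dataset itself).
    """
    entities: Set[str] = set()
    for line in lines:
        parts = line.rstrip("\n").split(sep)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        head, _relation, tail = parts
        entities.add(head)
        entities.add(tail)
    return len(entities)
```

Passing an open file handle for the EN triples file (opened with `encoding="utf-8"`) as `lines` gives the entity count for that side of the dataset. If the paper's figure also includes entities that occur only in the alignment links and not in any triple, the two numbers would differ for that reason alone.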
I looked into this today, and it is indeed a problem with the data. I suspect that a merging step caused it, since we wanted to use the inter-language links (ILLs) from both DBpedia and sameas.org. I am trying to recall how the data was built, but that was a long time ago, so I need some time to figure it out. I will get back to you as soon as I find something.
Hello, is this issue solved?