kgt5
Question about KGC datasets
Hi,
Thanks for the great work! Could you also share the KGC datasets fb15k-237 and WN18RR (the same format as wikidata5m)? BTW, I also see the dataset codex-m in the shared data but did not find results in your paper. Did you also conduct experiments on codex-m?
Hi Apoorv, thanks for the great work! I can't find data/wikidata5m/entity_strings.txt, but the code needs it. How can I get it?
Hi, thanks for your interest.
@PlusRoss We did some preliminary experiments, but unfortunately we don't have any final numbers to share.
@2682989487 You can run the script https://github.com/apoorvumang/kgt5/blob/main/data/get_unique_entities.py to get the entity strings.
Hi Apoorv, I found that after running get_unique_entities.py, some samples in the training data contain three | characters, which causes the program to fail. This problem was mentioned in issue #2. Has it been solved?
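For anyone hitting the same failure, a minimal sketch for locating the offending samples, assuming each sample is one line of text and that you know how many | separators a well-formed line should contain. The function name `find_bad_lines` and the `expected` parameter are hypothetical and not part of the kgt5 code; the sample strings are made up for illustration.

```python
def find_bad_lines(lines, expected, sep="|"):
    """Return (line_number, text) for every line whose separator count
    differs from `expected`. Line numbers start at 1.

    `expected` is an assumption supplied by the caller; it is not taken
    from the kgt5 code.
    """
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\n")
        if text.count(sep) != expected:
            bad.append((number, text))
    return bad


if __name__ == "__main__":
    # Made-up samples: the second one has one separator too many.
    samples = ["head|relation|tail\n", "head|with|pipe|tail\n"]
    for number, text in find_bad_lines(samples, expected=2):
        print(f"line {number}: {text}")
```

To check a real file, pass an open file object instead of the list, for example `find_bad_lines(open(path, encoding="utf-8"), expected=2)`, with `path` and `expected` set for your own data.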
Hi, please see https://github.com/apoorvumang/kgt5/issues/18#issuecomment-1227189777 for the updated mappings from wikidata ID to text. You can use these to reconstruct the dataset. I would recommend using these.
I will update the README soon to point to these mappings instead of our earlier KGC dataset. Let me know whether this works for you.
I ran into the problem from #2, which is marked as solved: I could not get the entity strings from wikidata5m. It seems I am using an old dataset. I will try the updated mappings from wikidata ID to text. Thank you for your answer; I look forward to the README update.