scispacy
Entity-linking for other ontologies
Hey there!
Firstly, thank you so much for all the hard work you all have put into scispacy! I really appreciate the addition of the transformer and entity-linking.
While I have experience using spaCy for NER and relation extraction, I am quite new to ontologies and to building spaCy entity linkers and knowledge bases. Would it be possible to get another ontology (specifically, https://github.com/Planteome/plant-trait-ontology) added? Or is there a tutorial or description of how I could do this myself?
Thank you again, Blake
@DeNeutoy could you point to how to add a new entity linker?
Hi @BlakeList,
Creating your own entity linker is quite straightforward - there is one fiddly bit in how the linkers are registered with scispacy at the moment which is less than ideal, but you should be able to follow the instructions in this issue:
https://github.com/allenai/scispacy/issues/237
You can use this script: https://github.com/allenai/scispacy/blob/master/scripts/create_linker.py
to create the files for the linker. The only input you need is a json/jsonl file with objects which look like this class:
https://github.com/allenai/scispacy/blob/4ade4ec897fa48c2ecf3187caa08a949920d126d/scispacy/linking_utils.py#L12
Once you've generated the linker and tested it out, we can see about getting it integrated into scispacy, if you think it would be useful!
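To make the input format concrete, here is a minimal sketch of writing such a jsonl file with only the standard library. It assumes the fields of the linked class are `concept_id`, `canonical_name`, `aliases`, `types` and `definition`; please check them against the pinned commit above. `EntityRecord`, `to_jsonl_lines`, `write_jsonl`, the `EX:0000001` entry and the output filename are made-up names for illustration, not part of scispacy.

```python
import json
from typing import Iterable, List, NamedTuple, Optional


class EntityRecord(NamedTuple):
    # Hypothetical local stand-in that mirrors the assumed fields of the
    # linked Entity class; it is not imported from scispacy.
    concept_id: str
    canonical_name: str
    aliases: List[str]
    types: List[str] = []
    definition: Optional[str] = None


def to_jsonl_lines(records: Iterable[EntityRecord]) -> List[str]:
    """Serialise each record as one JSON object per line."""
    return [json.dumps(record._asdict()) for record in records]


def write_jsonl(records: Iterable[EntityRecord], path: str) -> None:
    """Write the records to `path`, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in to_jsonl_lines(records):
            handle.write(line + "\n")


records = [
    # ID and name as mentioned in this thread; no aliases are assumed.
    EntityRecord("TO:0000387", "plant trait", aliases=[]),
    # Entirely made-up entry, only here to show the optional fields.
    EntityRecord(
        "EX:0000001",
        "example trait",
        aliases=["sample trait"],
        types=["EXAMPLE"],
        definition="A made-up definition for illustration.",
    ),
]

lines = to_jsonl_lines(records)
# write_jsonl(records, "plant_traits.jsonl")  # hypothetical output path
```

The resulting file is what you would hand to the `create_linker.py` script; I have not listed its command-line arguments here, so check the script itself for the exact options.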
Great! Thank you @DeNeutoy
Feel free to close 👍
@BlakeList did you get it working? Let us know if everything worked well for you!
Hi @DeNeutoy,
Sorry for the slow response. I am currently going down an alternative route: I have built an entity ruler instead (and now an NER model using prodigy + spaCy). The code above is quite straightforward; however, I had some trouble understanding how ontology classes can be used with the UMLS format.
Is there a standard approach for converting an ontology's RDF to the UMLS format? Can ontology classes (e.g. TO:0000387 for plant trait) be used as the concept id?
Cheers, Blake
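One possible route, sketched here under some assumptions: if the ontology is available in OBO flat-file format, each `[Term]` stanza can be mapped to one JSON record, with the ontology class id used directly as the concept id string. The record keys (`concept_id`, `canonical_name`, `aliases`, `types`, `definition`) are assumed from the class linked earlier in this thread. `parse_obo_terms`, `SAMPLE_OBO` and the `EX:0000001` term are made up for illustration. This is not an established scispacy conversion path, and it ignores obsolete terms, escaped characters and RDF/OWL input.

```python
import json
import re
from typing import Dict, List, Optional

# Tiny made-up OBO snippet. Only the first id/name pair comes from this
# thread; the second term is invented for illustration.
SAMPLE_OBO = '''\
format-version: 1.2

[Term]
id: TO:0000387
name: plant trait

[Term]
id: EX:0000001
name: example trait
def: "A made-up definition for illustration." [EX:ref]
synonym: "sample trait" EXACT []
synonym: "demo trait" RELATED []

[Typedef]
id: part_of
name: part of
'''

# Matches the leading double-quoted string of a def: or synonym: value.
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def _new_record() -> Dict:
    return {"concept_id": "", "canonical_name": "", "aliases": [],
            "types": [], "definition": None}


def parse_obo_terms(text: str) -> List[Dict]:
    """Turn [Term] stanzas into dicts; other stanza types are skipped."""
    records: List[Dict] = []
    current: Optional[Dict] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("[") and line.endswith("]"):
            # Any stanza header closes the open term; only [Term] opens one.
            if current is not None:
                records.append(current)
            current = _new_record() if line == "[Term]" else None
            continue
        if current is None or ":" not in line:
            continue
        tag, _, value = line.partition(":")
        value = value.strip()
        if tag == "id":
            current["concept_id"] = value
        elif tag == "name":
            current["canonical_name"] = value
        elif tag in ("synonym", "def"):
            match = QUOTED.match(value)
            if match and tag == "synonym":
                current["aliases"].append(match.group(1))
            elif match:
                current["definition"] = match.group(1)
    if current is not None:
        records.append(current)
    # Keep only terms that have both an id and a name.
    return [r for r in records if r["concept_id"] and r["canonical_name"]]


records = parse_obo_terms(SAMPLE_OBO)
jsonl_text = "\n".join(json.dumps(record) for record in records)
```

Writing `jsonl_text` to a file gives one object per line, which is the jsonl shape described earlier in the thread.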
Hi @DeNeutoy ,
I'm a bit confused! Are you able to provide a sample codebase that was used to create an entity linker component with a custom ontology and relevant resources?