Entity-linking for other ontologies

Open BlakeList opened this issue 3 years ago • 6 comments

Hey there!

Firstly, thank you so much for all the hard work you all have put into scispacy! I really appreciate the addition of the transformer and entity-linking.

While I have experience using spacy for ner and rel, I am quite new to ontologies and building spacy entity-linkers/knowledge bases. Would it be possible to get another ontology (specifically, https://github.com/Planteome/plant-trait-ontology) added? Or a tutorial/description of how I could do this myself?

Thank you again, Blake

BlakeList avatar Mar 10 '21 02:03 BlakeList

@DeNeutoy could you point to how to add a new entity linker?

dakinggg avatar Mar 10 '21 18:03 dakinggg

Hi @BlakeList,

Creating your own entity linker is quite straightforward - there is one fiddly bit in how the linkers are registered with scispacy at the moment which is less than ideal, but you should be able to follow the instructions in this issue:

https://github.com/allenai/scispacy/issues/237

You can use this script: https://github.com/allenai/scispacy/blob/master/scripts/create_linker.py

to create the files for the linker. The only input you need is a json/jsonl file with objects which look like this class:

https://github.com/allenai/scispacy/blob/4ade4ec897fa48c2ecf3187caa08a949920d126d/scispacy/linking_utils.py#L12

Once you've generated the linker and tested it out, we can see about getting it integrated into scispacy, if you think it would be useful!

DeNeutoy avatar Mar 20 '21 11:03 DeNeutoy
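
For readers following along, here is a minimal sketch of what that json/jsonl input could look like. The field names (`concept_id`, `canonical_name`, `aliases`, `types`, `definition`) are my reading of the `Entity` class linked above, so please check them against that file. `make_entity_record` and `to_jsonl` are hypothetical helpers, not part of scispacy, and the alias is invented for illustration; only the ID and label are the plant trait example from this thread.

```python
import json
from typing import Dict, Iterable, List, Optional


def make_entity_record(
    concept_id: str,
    canonical_name: str,
    aliases: Iterable[str] = (),
    types: Iterable[str] = (),
    definition: Optional[str] = None,
) -> Dict:
    """Build one knowledge-base entry as a plain dict.

    Assumption: these keys mirror the fields of the linked Entity class.
    """
    return {
        "concept_id": concept_id,
        "canonical_name": canonical_name,
        "aliases": list(aliases),
        "types": list(types),
        "definition": definition,
    }


def to_jsonl(records: List[Dict]) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


# Illustrative data: the alias below is made up, not taken from the ontology.
records = [
    make_entity_record(
        "TO:0000387",
        "plant trait",
        aliases=["plant characteristic"],
    )
]
print(to_jsonl(records), end="")
```

Saving that string to a `.jsonl` file should give something the `create_linker.py` script linked above can take as its knowledge-base input. I have not run this against the script, so treat it as a starting point.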

Great! Thank you @DeNeutoy

Feel free to close 👍

BlakeList avatar Mar 21 '21 00:03 BlakeList

@BlakeList did you get it working? Let us know if everything worked well for you!

DeNeutoy avatar Mar 22 '21 19:03 DeNeutoy

Hi @DeNeutoy,

Sorry for the slow response. I am currently going down an alternative route: I have built an entity ruler instead (and now an NER model using Prodigy + spaCy). The code above is quite straightforward; however, I had trouble understanding how ontology classes can be used with the UMLS format.

Is there a standard approach to convert from ontology RDFs to UMLS? Can ontology class IDs (e.g. TO:0000387 for plant trait) be used as the concept id?

Cheers, Blake

BlakeList avatar Mar 26 '21 04:03 BlakeList
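
I am not aware of a standard converter, but one possible route is sketched below, assuming the ontology is also available in OBO flat-file format. It reads `[Term]` stanzas and emits records with the ontology class ID as `concept_id`. As far as I can tell, the concept id is just a string, so an ID like TO:0000387 should work, but that is an assumption worth testing. `parse_obo_terms` is a hypothetical helper, it only handles the `id`, `name`, `def`, `synonym` and `is_obsolete` tags, and the sample stanzas are invented apart from the TO:0000387 / "plant trait" pair mentioned above.

```python
import re
from typing import Dict, List, Optional

# Matches the leading double-quoted string of a `def:` or `synonym:` value.
QUOTED_RE = re.compile(r'^"((?:[^"\\]|\\.)*)"')


def _quoted(value: str) -> Optional[str]:
    """Return the leading quoted string of an OBO tag value, if any."""
    match = QUOTED_RE.match(value)
    return match.group(1).replace('\\"', '"') if match else None


def _finish(term: Optional[Dict], records: List[Dict]) -> None:
    """Append a completed term unless it is missing fields or obsolete."""
    if term is None or term.pop("_obsolete", False):
        return
    if term["concept_id"] and term["canonical_name"]:
        records.append(term)


def parse_obo_terms(text: str) -> List[Dict]:
    """Convert [Term] stanzas of OBO text into entity-style records."""
    records: List[Dict] = []
    current: Optional[Dict] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("["):  # a new stanza header closes the previous one
            _finish(current, records)
            current = None
            if line == "[Term]":
                current = {"concept_id": None, "canonical_name": None,
                           "aliases": [], "types": [], "definition": None}
            continue
        if current is None or not line or line.startswith("!"):
            continue  # header lines, non-Term stanzas, blanks, comments
        tag, _, value = line.partition(":")
        value = value.strip()
        if tag == "id":
            current["concept_id"] = value
        elif tag == "name":
            current["canonical_name"] = value
        elif tag == "def":
            current["definition"] = _quoted(value)
        elif tag == "synonym":
            synonym = _quoted(value)
            if synonym:
                current["aliases"].append(synonym)
        elif tag == "is_obsolete" and value == "true":
            current["_obsolete"] = True
    _finish(current, records)  # flush the last stanza
    return records


# Invented sample: the definition, synonym, and second term are placeholders.
SAMPLE_OBO = """format-version: 1.2

[Term]
id: TO:0000387
name: plant trait
def: "Illustrative definition text, not copied from the ontology." [EXAMPLE:ref]
synonym: "plant characteristic" EXACT []

[Term]
id: TO:9999999
name: retired example term
is_obsolete: true

[Typedef]
id: part_of
name: part of
"""

for record in parse_obo_terms(SAMPLE_OBO):
    print(record["concept_id"], record["canonical_name"], record["aliases"])
```

Each resulting dict can then be written out as one line of the json/jsonl input described earlier in the thread. For an ontology that only ships as RDF/OWL, a dedicated parser would be needed in place of this hand-rolled reader.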

Hi @DeNeutoy,

I'm a bit confused! Are you able to provide a sample codebase that was used to create an entity linker component with a custom ontology and relevant resources?

viraj-lakshitha avatar Nov 21 '22 06:11 viraj-lakshitha