QuickUMLS
about `nlp.add_pipe` in the demo
Describe the bug
When I run the demo code, the line `nlp.add_pipe(quickumls_component)` fails with the following error:
Traceback (most recent call last):
  File "umlsdemo.py", line 8, in <module>
    ...
ValueError: `nlp.add_pipe` now takes the string name of the registered component factory, not a callable component. Expected string, but got <quickumls.spacy_component.SpacyQuickUMLS object at 0x7f24b35e5cd0> (name: 'None').
- If you created your component with `nlp.create_pipe('name')`: remove nlp.create_pipe and call `nlp.add_pipe('name')` instead.
- If you passed in a component like `TextCategorizer()`: call `nlp.add_pipe` with the string name instead, e.g. `nlp.add_pipe('textcat')`.
- If you're using a custom component: Add the decorator `@Language.component` (for function components) or `@Language.factory` (for class components / factories) to your custom component and assign it a name, e.g. `@Language.component('your_name')`. You can then run `nlp.add_pipe('your_name')` to add it to the pipeline.
To Reproduce
**Environment**
- OS: Ubuntu
- QuickUMLS version: 1.4.0.post1
- UMLS version: 2021AB
- spaCy version: 3.2.0
Additional context
This seems to be related to spaCy, according to https://stackoverflow.com/questions/67906945/valueerror-nlp-add-pipe-now-takes-the-string-name-of-the-registered-component-f, but I still don't know how to modify the code.
It can be used like this.
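import spacy
from spacy.language import Language
from quickumls.spacy_component import SpacyQuickUMLS

# The snippet never defined nlp; the model name here is an assumption.
nlp = spacy.load('en_core_web_sm')

@Language.component('quickumls_component')
def quickumls_component(doc):
    # Replace the placeholder with the path to your QuickUMLS installation.
    return SpacyQuickUMLS(nlp, <Path to quickUmls install dir>)(doc)

nlp.add_pipe('quickumls_component', last=True)
# full_rpts is the poster's own pandas data; any string can be passed instead.
doc = nlp(full_rpts.iloc[0])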
Hi everyone,
When I use this code, I get this error:
[E090] Extension 'similarity' already exists on Span. To overwrite the existing extension, set `force=True` on `Span.set_extension`.
@shrimonmuke0202 did you solve this problem? I get the same error: [E090] Extension 'similarity' already exists on Span. To overwrite the existing extension, set `force=True` on `Span.set_extension`.
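A possible explanation, offered as a guess rather than a confirmed diagnosis: the component function posted earlier in this thread constructs a new `SpacyQuickUMLS` object every time a document is processed. Assuming `SpacyQuickUMLS` registers its `Span` extensions (such as `similarity`) in its constructor, which I have not verified against the QuickUMLS source, the second construction would hit E090. A minimal sketch of a wrapper that builds the matcher only once is below; `make_component` and `FakeMatcher` are hypothetical names, and `FakeMatcher` is only a stand-in so the sketch runs without a UMLS install.

```python
def make_component(build_matcher):
    """Return a doc -> doc callable that builds the matcher only on first use."""
    cache = {}

    def component(doc):
        if "matcher" not in cache:
            # Runs once in total, not once per document
            cache["matcher"] = build_matcher()
        return cache["matcher"](doc)

    return component


# Hypothetical stand-in for SpacyQuickUMLS: it only counts how many
# times it has been constructed and passes the doc through unchanged.
class FakeMatcher:
    instances = 0

    def __init__(self):
        FakeMatcher.instances += 1

    def __call__(self, doc):
        return doc


component = make_component(FakeMatcher)
for text in ["first note", "second note"]:
    component(text)

print(FakeMatcher.instances)  # prints 1
```

With spaCy 3 this could presumably be wired up as `Language.component('quickumls_component', func=make_component(lambda: SpacyQuickUMLS(nlp, quickumls_fp)))` followed by `nlp.add_pipe('quickumls_component', last=True)`, where `quickumls_fp` is your install path. I have not run that against a real QuickUMLS install, and if the extensions were already registered earlier in the same Python session, a fresh interpreter may still be needed.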
Hi there, thank you so much for sharing a solution! I was able to get past the add_pipe error but no further. Could you explain what the line `doc = nlp(full_rpts.iloc[0])` does? I tried replacing it with something like `doc = nlp('Pt c/o shortness of breath, chest pain, nausea, vomiting, diarrrhea')`, but that does not work. Initially I copied your code as is, but it fails with an error saying "full_rpts" is not defined. Is there some missing context about this line of code? Thank you so much!
`full_rpts.iloc[0]` just returns a string from a pandas DataFrame, so `doc = nlp('Pt c/o shortness of breath, chest pain, nausea, vomiting, diarrrhea')` is correct. Did you update the QuickUMLS install location in the code below?
def quickumls_component(doc):
    return SpacyQuickUMLS(nlp, <Path to quickUmls install dir>)(doc)
Is the Path to quickUmls install dir supposed to be the same as quickumls_fp in this code block?
matcher = QuickUMLS(quickumls_fp, ...)
If so, I am doing that, yet I get this message:
Loading QuickUMLS resources from a default SAMPLE of UMLS data from here: /opt/conda/envs/python38/lib/python3.8/site-packages/resources/quickumls/QuickUMLS_SAMPLE_lowercase_POSIX_unqlite
and there is no output from the print statements in the OP's code block.
However, this works fine:
from quickumls import QuickUMLS

# Initialize the QuickUMLS matcher
matcher = QuickUMLS("./libraries/quickumls", "score", 0.99)

def quick_UMLS_match(medical_text):
    # Cap the input at 1,000,000 characters; slicing leaves shorter text unchanged
    processed_text = medical_text[:1000000]
    return matcher.match(processed_text, best_match=True, ignore_syntax=False)
But I am trying to switch to medspaCy, because I currently extract items from the QuickUMLS output in a very inefficient way and the spaCy component seems like the proper approach. For what it's worth, this is how I do it:
def quick_UMLS_extractor(matcher_output, return_field, unique=True):
    # Flatten the nested match lists and pull one field from each match
    return_items = [entity[return_field] for sublst in matcher_output for entity in sublst]
    if unique:
        # Deduplicate; note that set() does not preserve the original order
        return_items = list(set(return_items))
    return return_items
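In case it helps, here is a small variation on the extractor above that keeps the first-seen order when deduplicating, which `list(set(...))` does not guarantee. It is only a sketch: `extract_field` is a hypothetical name, and the sample data assumes the matcher output is a list of lists of dicts, as the comprehension above implies. The CUIs in it are made up for illustration.

```python
def extract_field(matcher_output, field, unique=True):
    """Collect `field` from every match in a nested list of match dicts."""
    values = [match[field] for group in matcher_output for match in group]
    if unique:
        # dict.fromkeys keeps the first occurrence of each value, in order
        values = list(dict.fromkeys(values))
    return values


# Hypothetical sample shaped like the nested output assumed above;
# the CUIs are placeholders, not real UMLS identifiers.
sample_output = [
    [{"term": "chest pain", "cui": "C0000001"}, {"term": "pain", "cui": "C0000002"}],
    [{"term": "nausea", "cui": "C0000003"}],
    [{"term": "chest pain", "cui": "C0000001"}],
]

print(extract_field(sample_output, "cui"))
# prints ['C0000001', 'C0000002', 'C0000003']
```

Like the `set()` version, this only works for hashable field values such as strings, so a field that holds a set or list would need to be converted (for example to a tuple) first.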