spacyfishing
503 Error for spaCy fishing
Describe the bug
The Entity Fishing pipeline component is not returning any results from Wikidata.
To Reproduce
Here is the code:
```python
import spacy
import pandas as pd
# display() is built into Jupyter; imported explicitly so the code also runs elsewhere
from IPython.display import display

nlp = spacy.load('en_core_web_sm')
nlp.add_pipe("merge_entities")
nlp.add_pipe("entityfishing", config={"extra_info": True})

text = """
Tania’s story began in the UK in the summer of 2000
Tania’s story began in the United Kingdom in the summer of 2000
Tania’s story began in Great Britain in the summer of 2000
Tania’s story began in GBR in the summer of 2000\
"""
doc = nlp(text)

# One row per geopolitical entity, with the Wikidata fields set by entityfishing
display(pd.DataFrame(
    [
        {
            'Named Entity': ent.root.text,
            'Label': ent.root.ent_type_,
            'OntoNotes Description': spacy.explain(ent.root.ent_type_),
            'Wikidata ID': ent._.kb_qid,
            'Nerd Score': ent._.nerd_score,
            'Normal term': ent._.normal_term,
        }
        for ent in doc.ents
        if ent.label_ == "GPE"  # label_ is the string label; label is an integer hash
    ]
))

# Raw API annotations and response metadata attached to the Doc
print(doc._.annotations)
print(doc._.metadata)
```
And here are the results:
Expected behavior
I would expect the above code to return a Wikidata ID for each named entity in the text.
Desktop (please complete the following information):
- OS: MacOS 14
- Python version: Python 3.9.13
- SpaCy version: 3.4.3
- spacyfishing version: 0.1.8
I have the same issue of not getting any results from Wikidata.
I tried running the first example in the README ("Simple example"). The expected output is:
('Victor Hugo', 'PERSON', 'Q535', 'https://www.wikidata.org/wiki/Q535', 0.972)
('Honoré de Balzac', 'PERSON', 'Q9711', 'https://www.wikidata.org/wiki/Q9711', 0.9724)
('French', 'NORP', 'Q121842', 'https://www.wikidata.org/wiki/Q121842', 0.3739)
('Paris', 'GPE', 'Q90', 'https://www.wikidata.org/wiki/Q90', 0.5652)
but the result I'm getting is:
('Victor Hugo', 'PERSON', None, None, None)
('Honoré de Balzac', 'PERSON', None, None, None)
('French', 'NORP', None, None, None)
('Paris', 'GPE', None, None, None)
Thank you for any help in resolving this issue.
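When every entity comes back with `None` in all linking fields, as in the output above, it can help to tell a remote service failure apart from a genuine lack of matches. The following is only a minimal sketch: `diagnose_linking` is a hypothetical helper, not part of spacyfishing, and it assumes you can obtain the HTTP status code of the Entity-Fishing response yourself (for instance by inspecting whatever `doc._.metadata` prints).

```python
from typing import Optional, Sequence, Tuple

# (text, label, Wikidata ID, Wikidata URL, score), matching the README example output
Row = Tuple[str, str, Optional[str], Optional[str], Optional[float]]


def diagnose_linking(rows: Sequence[Row], status_code: Optional[int] = None) -> str:
    """Hypothetical helper: summarize why entity linking yielded no Wikidata IDs."""
    if not rows:
        return "no entities recognized by the NER model"
    # An entity counts as linked when its Wikidata ID field is set.
    linked = [row for row in rows if row[2] is not None]
    if linked:
        return f"{len(linked)}/{len(rows)} entities linked"
    # Nothing was linked: a 5xx status points at the remote service, not the text.
    if status_code is not None and status_code >= 500:
        return f"service error (HTTP {status_code}); no entities linked"
    return "no entities linked; check the API endpoint and response metadata"


rows = [
    ("Victor Hugo", "PERSON", None, None, None),
    ("Honoré de Balzac", "PERSON", None, None, None),
    ("French", "NORP", None, None, None),
    ("Paris", "GPE", None, None, None),
]
print(diagnose_linking(rows, status_code=503))
```

Here the status code 503 is taken from the issue title; with a different or unknown status the helper falls back to a generic hint.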
I just checked the closed issues, and this is related to https://github.com/Lucaterre/spacyfishing/issues/12. The solution there worked for me.