stanza
Improve error message for missing annotators
I have the following MWE:
from stanfordnlp.server import CoreNLPClient
text = 'Barack Obama was born in the Hawaii. He was the president of the United States. '
prop = {'annotators': 'coref', 'coref.algorithm' : 'neural'}
with CoreNLPClient(properties=prop, timeout=60000, memory='16G', quietsss=False) as client:
    ann = client.annotate(text)
The CORENLP_HOME environment variable is set correctly, but the code crashes:
D:\data\progetti_miei\corenlp_coref\stanfordnlp_official>python test_mine_bugreport.py
Starting server with command: java -Xmx16G -cp D:\data\programmi\StanfordCoreNLP\stanford-corenlp-full-2018-10-05/* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 60000 -threads 5 -maxCharLength 100000 -quiet True -serverProperties corenlp_server-fd10ce87f3414e2b.props -preload coref
Traceback (most recent call last):
  File "D:\data\programmi\Python37\lib\site-packages\stanfordnlp\server\client.py", line 330, in _request
    r.raise_for_status()
  File "D:\data\programmi\Python37\lib\site-packages\requests\models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http://localhost:9000/?properties=%7B%27outputFormat%27%3A+%27serialized%27%7D
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "test_mine_bugreport.py", line 5, in <module>
    ann = client.annotate(text)
  File "D:\data\programmi\Python37\lib\site-packages\stanfordnlp\server\client.py", line 398, in annotate
    r = self._request(text.encode('utf-8'), request_properties, **kwargs)
  File "D:\data\programmi\Python37\lib\site-packages\stanfordnlp\server\client.py", line 336, in _request
    raise AnnotationException(r.text)
stanfordnlp.server.client.AnnotationException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
If I use the statistical algorithm instead of the neural one, the code works as expected.
The crash turned out to be caused by missing annotators that the coref annotator requires. In fact, running the equivalent pipeline directly in CoreNLP gives:
java -Xmx5g -cp stanford-corenlp-3.9.2.jar;stanford-corenlp-3.9.2-models.jar;* edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators coref -coref.algorithm neural -file example_file.txt
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator coref
[main] INFO edu.stanford.nlp.coref.neural.NeuralCorefAlgorithm - Loading coref model edu/stanford/nlp/models/coref/neural/english-model-default.ser.gz ... done [0.4 sec].
[main] INFO edu.stanford.nlp.coref.neural.NeuralCorefAlgorithm - Loading coref embeddings edu/stanford/nlp/models/coref/neural/english-embeddings.ser.gz ... done [0.4 sec].
[main] INFO edu.stanford.nlp.pipeline.CorefMentionAnnotator - Using mention detector type: rule
Exception in thread "main" java.lang.IllegalArgumentException: annotator "coref" requires annotation "BasicDependenciesAnnotation". The usual requirements for this annotator are: tokenize,ssplit,pos,lemma,ner,depparse
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:260)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:192)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:188)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.main(StanfordCoreNLP.java:1388)
Here the error message is much more useful: it states directly what the problem is and which annotators are missing, whereas the Python wrapper only reports an obscure NullPointerException.
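To illustrate the kind of message I have in mind, here is a rough sketch of a client-side check that could run before the request is sent. Everything in it is an assumption on my part: `USUAL_REQUIREMENTS`, `missing_prerequisites`, and `check_annotators` are hypothetical names that do not exist in stanfordnlp, and the prerequisite list is simply copied from the CoreNLP message above rather than queried from the server.

```python
# Hypothetical client-side check; none of these names exist in stanfordnlp.
# The prerequisite list is copied from the CoreNLP error message above
# ("The usual requirements for this annotator are: ...").
USUAL_REQUIREMENTS = {
    'coref': ['tokenize', 'ssplit', 'pos', 'lemma', 'ner', 'depparse'],
}


def missing_prerequisites(annotators):
    """Return prerequisites that do not appear before the annotator needing them."""
    names = [a.strip() for a in annotators.split(',') if a.strip()]
    missing = []
    for i, name in enumerate(names):
        earlier = names[:i]
        for req in USUAL_REQUIREMENTS.get(name, []):
            if req not in earlier and req not in missing:
                missing.append(req)
    return missing


def check_annotators(annotators):
    """Raise a descriptive error instead of waiting for the server's 500."""
    missing = missing_prerequisites(annotators)
    if missing:
        raise ValueError(
            'annotators "{}" is missing the usual requirements: {}'.format(
                annotators, ','.join(missing)))


prop = {'annotators': 'coref', 'coref.algorithm': 'neural'}
try:
    check_annotators(prop['annotators'])
except ValueError as err:
    print(err)

# Listing the prerequisites explicitly passes the check.
prop['annotators'] = 'tokenize,ssplit,pos,lemma,ner,depparse,coref'
check_annotators(prop['annotators'])
```

Going by the CoreNLP message, the second `annotators` value should also be what the MWE needs in order to run, although I have only sketched the check here and not a full fix for the wrapper.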