stanford-corenlp-docker
Does coreference annotator work in this image?
Have you managed to get the coref annotator to work? I'm seeing the image silently crashing when I bring up the GUI and add the coref annotator. Using latest corenlp, built as you suggest.
No crash with coref when I use the node package from https://github.com/gerardobort/node-corenlp, natively on an Intel Mac with openjdk-15.
I'm wondering whether this is a Java version issue, an out-of-memory problem, or something else.
I haven't used CoreNLP for coreference in a long time. If you can provide me with the exact commands that you ran, I'll try to reproduce the error.
Launch the server, exactly as in your readme
Go to the GUI at localhost:9000.
Type any sentence and hit Submit, e.g. "the dog walks". It works and renders the Brat visualizations.
Click to add the coreference annotator and hit Submit again. It tries to load the models, and the docker process quits without returning anything. The GUI shows a red bar in the browser.
Make sense? If not, I can try with curl or something. But it will be the same, pretty sure.
Chris
On Wed, Oct 20, 2021 at 9:54 AM Arne Neumann @.***> wrote:
I haven't used CoreNLP for coreference in a long time. If you can provide me with the exact commands that you ran, I'll try to reproduce the error.
Get a server running on port 9000. Natively, that is:
npm explore corenlp -- npm run corenlp:server
You have to futz around a little, putting a CoreNLP distribution into npm_modules/corenlp/corenlp/stanford-core-nlp-4.3.0 and adjusting the server start script to match.
then do
wget --post-data 'John said he would come but he did not' 'localhost:9000/?properties={"annotators": "coref", "outputFormat": "json"}' -O - | jq .corefs
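For anyone who prefers a script to `wget`, the same request can be sketched in Python with only the standard library. The function names `build_coref_request` and `fetch_corefs`, the default URL, and the timeout are placeholders chosen for this example; the `properties` query parameter and the POST body mirror the command above, and the reply is assumed to carry the chains under the `corefs` key.

```python
import json
import urllib.parse
import urllib.request


def build_coref_request(text, base_url="http://localhost:9000/"):
    """Build the POST request that the wget command above sends.

    base_url is an assumption matching the port used in this thread.
    """
    properties = {"annotators": "coref", "outputFormat": "json"}
    # The settings travel as a URL-encoded `properties` query parameter.
    query = urllib.parse.urlencode({"properties": json.dumps(properties)})
    return urllib.request.Request(
        base_url + "?" + query,
        data=text.encode("utf-8"),  # the raw text goes in the POST body
        method="POST",
    )


def fetch_corefs(text, base_url="http://localhost:9000/", timeout=120):
    """Send the request and return the `corefs` part of the JSON reply."""
    request = build_coref_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response).get("corefs")
```

Calling `fetch_corefs("John said he would come but he did not")` against a running server should return the same structure that `jq` prints; if the server dies mid-request, `urlopen` raises an exception instead of hanging silently.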
Native server output is:
{
  "2": [
    {
      "id": 0,
      "text": "John",
      "type": "PROPER",
      "number": "SINGULAR",
      "gender": "MALE",
      "animacy": "ANIMATE",
      "startIndex": 1,
      "endIndex": 2,
      "headIndex": 1,
      "sentNum": 1,
      "position": [
        1,
        1
      ],
      "isRepresentativeMention": true
    },
    {
      "id": 1,
      "text": "he",
      "type": "PRONOMINAL",
      "number": "SINGULAR",
      "gender": "MALE",
      "animacy": "ANIMATE",
      "startIndex": 3,
      "endIndex": 4,
      "headIndex": 3,
      "sentNum": 1,
      "position": [
        1,
        2
      ],
      "isRepresentativeMention": false
    },
    {
      "id": 2,
      "text": "he",
      "type": "PRONOMINAL",
      "number": "SINGULAR",
      "gender": "MALE",
      "animacy": "ANIMATE",
      "startIndex": 7,
      "endIndex": 8,
      "headIndex": 7,
      "sentNum": 1,
      "position": [
        1,
        3
      ],
      "isRepresentativeMention": false
    }
  ]
}
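To read this output at a glance, each chain can be reduced to its representative mention plus the mentions that refer back to it. The following is a minimal sketch, not part of CoreNLP: the helper name `summarize_chains` is made up for this example, and it relies only on the `text`, `startIndex` and `isRepresentativeMention` fields visible in the output above.

```python
def summarize_chains(corefs):
    """Map each chain id to (representative text, other mentions)."""
    summary = {}
    for chain_id, mentions in corefs.items():
        representative = next(
            (m for m in mentions if m["isRepresentativeMention"]), None
        )
        if representative is None:
            continue  # skip chains without a flagged representative
        others = [
            (m["text"], m["startIndex"])
            for m in mentions
            if not m["isRepresentativeMention"]
        ]
        summary[chain_id] = (representative["text"], others)
    return summary


# Trimmed copy of the output above (only the fields used here).
sample = {
    "2": [
        {"text": "John", "startIndex": 1, "isRepresentativeMention": True},
        {"text": "he", "startIndex": 3, "isRepresentativeMention": False},
        {"text": "he", "startIndex": 7, "isRepresentativeMention": False},
    ]
}
print(summarize_chains(sample))
# prints {'2': ('John', [('he', 3), ('he', 7)])}
```

Both pronouns end up in the same chain as "John", which is the result expected for this sentence.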
Using a docker image of CoreNLP 4.3.0 built with your Dockerfile:
docker run -p 9000:9000 corenlp
Output in the docker window:
[main] INFO CoreNLP - --- StanfordCoreNLPServer#main() called ---
[main] INFO CoreNLP - Server default properties:
(Note: unspecified annotator properties are English defaults)
annotators = tokenize,ssplit,parse
inputFormat = text
outputFormat = json
prettyPrint = false
[main] INFO CoreNLP - Threads: 8
[main] INFO CoreNLP - Starting server...
[main] INFO CoreNLP - StanfordCoreNLPServer listening at /0.0.0.0:9000
[pool-1-thread-1] INFO CoreNLP - [/172.17.0.1:64090] API call w/annotators tokenize,ssplit,pos,lemma,ner,depparse,coref
John said he would come but he did not
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[pool-1-thread-1] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words-distsim.tagger ... done [1.1 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
[pool-1-thread-1] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [2.9 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [1.2 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [1.1 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.time.JollyDayHolidays - Initializing JollyDayHoliday for SUTime from classpath edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml as sutime.binder.1.
[pool-1-thread-1] INFO edu.stanford.nlp.time.TimeExpressionExtractorImpl - Using following SUTime rules: edu/stanford/nlp/models/sutime/defs.sutime.txt,edu/stanford/nlp/models/sutime/english.sutime.txt,edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 580705 unique entries out of 581864 from edu/stanford/nlp/models/kbp/english/gazetteers/regexner_caseless.tab, 0 TokensRegex patterns.
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 4867 unique entries out of 4867 from edu/stanford/nlp/models/kbp/english/gazetteers/regexner_cased.tab, 0 TokensRegex patterns.
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 585572 unique entries from 2 files
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.NERCombinerAnnotator - numeric classifiers: true; SUTime: true [no docDate]; fine grained: true
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator depparse
[pool-1-thread-1] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Loading depparse model: edu/stanford/nlp/models/parser/nndep/english_UD.gz ... Time elapsed: 1.2 sec
[pool-1-thread-1] INFO edu.stanford.nlp.parser.nndep.Classifier - PreComputed 20000 vectors, elapsed Time: 2.701 sec
[pool-1-thread-1] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Initializing dependency parser ... done [3.9 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator coref
(venv) ~/working/examp/
And the docker process dies before returning anything.
In the query window it goes like this, which makes a good deal of sense given that the docker process dies:
` ~/Documents/GitHub/ wget --post-data 'John said he would come but he did not' 'localhost:9000/?properties={"annotators": "coref", "outputFormat": "json"}' -O - | jq .coref
--2021-10-20 10:48:50-- http://localhost:9000/?properties=%7B%22annotators%22:%20%22coref%22,%20%22outputFormat%22:%20%22json%22%7D
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:9000... connected.
HTTP request sent, awaiting response... No data received.
Retrying.
--2021-10-20 10:49:22-- (try: 2) http://localhost:9000/?properties=%7B%22annotators%22:%20%22coref%22,%20%22outputFormat%22:%20%22json%22%7D
Connecting to localhost (localhost)|::1|:9000... failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:9000... failed: Connection refused.
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:9000... failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:9000... failed: Connection refused. `
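The server log stops right after `Adding annotator coref`, which would be consistent with the container being killed while the coref models load, although nothing in this thread confirms that. One way to test the out-of-memory hypothesis is to look at the state Docker records for the stopped container. The sketch below parses the JSON that `docker inspect` prints; the helper names `describe_exit` and `inspect_container` are made up for this example, and the container name has to be looked up with `docker ps -a` first.

```python
import json
import subprocess


def describe_exit(inspect_output):
    """Summarize why a container stopped, given `docker inspect` JSON text."""
    state = json.loads(inspect_output)[0]["State"]
    exit_code = state.get("ExitCode")
    if state.get("OOMKilled"):
        return f"exit code {exit_code}: killed by the kernel OOM killer"
    if exit_code == 137:
        # 137 = 128 + 9 (SIGKILL); often, but not always, memory pressure.
        return "exit code 137: received SIGKILL, possibly out of memory"
    return f"exit code {exit_code}: no sign of an OOM kill"


def inspect_container(name):
    """Run `docker inspect` on a stopped container and describe its exit.

    `name` is whatever `docker ps -a` reports for the dead container.
    """
    result = subprocess.run(
        ["docker", "inspect", name],
        capture_output=True, text=True, check=True,
    )
    return describe_exit(result.stdout)
```

This only works while the stopped container still exists, i.e. it was not started with `--rm`. If the result points at memory, raising the memory available to Docker or to the JVM would be the next thing to try; if not, the Java version remains a candidate.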
It seems that I added an `ANNOTATORS` env to the `Dockerfile` in 2021, but I never answered in this thread. I have changed the default to make CoreNLP always use all annotators, so running
docker buildx build -t corenlp https://github.com/NLPbox/stanford-corenlp-docker.git
docker run -p 9000:9000 corenlp
in one terminal and running your query in another should give you the desired result:
wget --post-data 'John said he would come but he did not' 'localhost:9000/?properties={"annotators": "coref", "outputFormat": "json"}' -O - | jq .corefs
{
  "2": [
    {
      "id": 0,
      "text": "John",
      "type": "PROPER",
      "number": "SINGULAR",
      "gender": "MALE",
      "animacy": "ANIMATE",
      "startIndex": 1,
      "endIndex": 2,
      "headIndex": 1,
      "sentNum": 1,
      "position": [
        1,
        1
      ],
      "isRepresentativeMention": true
    },
    {
      "id": 1,
      "text": "he",
      "type": "PRONOMINAL",
      "number": "SINGULAR",
      "gender": "MALE",
      "animacy": "ANIMATE",
      "startIndex": 3,
      "endIndex": 4,
      "headIndex": 3,
      "sentNum": 1,
      "position": [
        1,
        2
      ],
      "isRepresentativeMention": false
    },
    {
      "id": 2,
      "text": "he",
      "type": "PRONOMINAL",
      "number": "SINGULAR",
      "gender": "MALE",
      "animacy": "ANIMATE",
      "startIndex": 7,
      "endIndex": 8,
      "headIndex": 7,
      "sentNum": 1,
      "position": [
        1,
        3
      ],
      "isRepresentativeMention": false
    }
  ]
}