CoreNLP
edu.stanford.nlp.semgraph.UnknownVertexException
When I parse text with CoreNLP 4.3.0 in a concurrently threaded task, the parser throws an exception:
java.util.concurrent.ExecutionException: edu.stanford.nlp.semgraph.UnknownVertexException: Operation attempted on unknown vertex felt/VBD in graph -> was/VBD (root)
-> all/DT (advmod)
-> After/IN (case)
-> ,/, (punct)
-> Dana/NNP (nsubj)
-> there/RB (advmod)
-> right/RB (advmod)
-> next/JJ (advmod)
-> me/PRP (obl)
-> to/IN (case)
-> ./. (punct)
Caused by: edu.stanford.nlp.semgraph.UnknownVertexException: null
at edu.stanford.nlp.semgraph.SemanticGraph.parentPairs(SemanticGraph.java:730)
at edu.stanford.nlp.semgraph.semgrex.GraphRelation$DEPENDENT$1.advance(GraphRelation.java:325)
at edu.stanford.nlp.semgraph.semgrex.GraphRelation$SearchNodeIterator.initialize(GraphRelation.java:1103)
at edu.stanford.nlp.semgraph.semgrex.GraphRelation$SearchNodeIterator.<init>(GraphRelation.java:1084)
at edu.stanford.nlp.semgraph.semgrex.GraphRelation$DEPENDENT$1.<init>(GraphRelation.java:310)
at edu.stanford.nlp.semgraph.semgrex.GraphRelation$DEPENDENT.searchNodeIterator(GraphRelation.java:310)
at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.resetChildIter(NodePattern.java:337)
at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.<init>(NodePattern.java:332)
at edu.stanford.nlp.semgraph.semgrex.NodePattern.matcher(NodePattern.java:293)
at edu.stanford.nlp.semgraph.semgrex.CoordinationPattern$CoordinationMatcher.<init>(CoordinationPattern.java:146)
at edu.stanford.nlp.semgraph.semgrex.CoordinationPattern.matcher(CoordinationPattern.java:120)
at edu.stanford.nlp.semgraph.semgrex.CoordinationPattern$CoordinationMatcher.<init>(CoordinationPattern.java:146)
at edu.stanford.nlp.semgraph.semgrex.CoordinationPattern.matcher(CoordinationPattern.java:120)
at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.resetChild(NodePattern.java:356)
at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.goToNextNodeMatch(NodePattern.java:455)
at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.matches(NodePattern.java:572)
at edu.stanford.nlp.semgraph.semgrex.SemgrexMatcher.find(SemgrexMatcher.java:193)
at edu.stanford.nlp.trees.UniversalEnglishGrammaticalStructure.processComplex2WP(UniversalEnglishGrammaticalStructure.java:1604)
at edu.stanford.nlp.trees.UniversalEnglishGrammaticalStructure.processMultiwordPreps(UniversalEnglishGrammaticalStructure.java:1541)
at edu.stanford.nlp.trees.UniversalEnglishGrammaticalStructure.addEnhancements(UniversalEnglishGrammaticalStructure.java:915)
at edu.stanford.nlp.trees.UniversalEnglishGrammaticalStructure.addEnhancements(UniversalEnglishGrammaticalStructure.java:986)
at edu.stanford.nlp.trees.UniversalEnglishGrammaticalStructure.collapseDependencies(UniversalEnglishGrammaticalStructure.java:1042)
at edu.stanford.nlp.trees.GrammaticalStructure.typedDependenciesCCprocessed(GrammaticalStructure.java:895)
at edu.stanford.nlp.semgraph.SemanticGraphFactory.makeFromTree(SemanticGraphFactory.java:258)
at edu.stanford.nlp.semgraph.SemanticGraphFactory.generateCCProcessedDependencies(SemanticGraphFactory.java:163)
at edu.stanford.nlp.pipeline.ParserAnnotatorUtils.fillInParseAnnotations(ParserAnnotatorUtils.java:65)
at edu.stanford.nlp.pipeline.ParserAnnotator.finishSentence(ParserAnnotator.java:309)
at edu.stanford.nlp.pipeline.ParserAnnotator.doOneSentence(ParserAnnotator.java:275)
at edu.stanford.nlp.pipeline.SentenceAnnotator.annotate(SentenceAnnotator.java:102)
at edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(AnnotationPipeline.java:76)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:655)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
When I try to parse the same text again, it works. Is this caused by multi-threading? Thanks.
Are you able to give us a bit more of an example? I've looked through the various code paths, and there's one block which is potentially using various multithreading operations. However, we've tried to secure it with locks. What system are you using? Also, what was the original text - that might give some more insight into the particular error which was triggered. It seems pretty suspicious that it's looking for a word which isn't even present in the graph.
On Sun, Dec 5, 2021 at 5:09 PM d0ngw wrote:
> When I parse text with CoreNLP 4.3.0 in a concurrently threaded task, the parser throws an exception: [...]
Thanks for your reply. Our system is Ubuntu 18.04.5.
I will send the original text to your email; please check it.
All I see is one sentence. I meant I could use some mechanism to reproduce the error. Have you seen it happen more than once?
On Mon, Dec 6, 2021 at 9:54 PM d0ngw wrote:
> Thanks for your reply. Our system is Ubuntu 18.04.5. I will send the original text to your email; please check it.
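For anyone trying to build the kind of reproduction mechanism asked for above, here is a minimal sketch of a stress harness. It is not part of CoreNLP: `ConcurrentStress` and `run` are hypothetical names, and the annotation call is passed in as a plain `Consumer<String>` so the harness itself only needs the JDK. It submits every text several times to a fixed thread pool and reports which inputs failed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical stress harness; not part of CoreNLP. */
final class ConcurrentStress {

  /**
   * Calls annotate on every text, `rounds` times, across `threads` workers.
   * Returns the inputs whose task threw, one entry per failed task.
   */
  static List<String> run(Consumer<String> annotate, List<String> texts,
                          int threads, int rounds) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List<String> submitted = new ArrayList<>();
    List<Future<?>> futures = new ArrayList<>();
    try {
      // Queue all work first so the workers genuinely overlap.
      for (int r = 0; r < rounds; r++) {
        for (String text : texts) {
          submitted.add(text);
          futures.add(pool.submit(() -> annotate.accept(text)));
        }
      }
      List<String> failed = new ArrayList<>();
      for (int i = 0; i < futures.size(); i++) {
        try {
          futures.get(i).get();
        } catch (ExecutionException e) {
          // The task threw; e.getCause() holds the original exception.
          failed.add(submitted.get(i));
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IllegalStateException("interrupted while waiting", e);
        }
      }
      return failed;
    } finally {
      pool.shutdownNow();
    }
  }
}
```

With CoreNLP on the classpath, the consumer would presumably be something like `text -> pipeline.annotate(new Annotation(text))`, with one `StanfordCoreNLP` pipeline shared by all workers, since that is the setup the stack trace suggests. Given how rarely the error shows up, a large number of rounds may be needed before anything fails.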
Yes, it has happened more than once. We started a parser server to parse the text, and it happens about once every 2-3 days.
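Since the failure is rare and, per the original report, the same text parses fine on a second attempt, a retry wrapper might serve as a stopgap on the server side. The sketch below is only a workaround under that assumption, not a fix for whatever race is involved. `Retry` and `withRetries` are hypothetical names, and it assumes the failure surfaces as an unchecked exception, which the stack trace suggests.

```java
import java.util.function.Supplier;

/** Hypothetical retry helper; a stopgap, not a fix for the underlying bug. */
final class Retry {

  /**
   * Runs task up to maxAttempts times and returns the first successful result.
   * If every attempt throws, the last exception is rethrown.
   */
  static <T> T withRetries(Supplier<T> task, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return task.get();
      } catch (RuntimeException e) {
        last = e; // remember the failure and try again
      }
    }
    throw last;
  }
}
```

Logging each caught exception together with the input text would also help collect the reproduction data requested earlier in the thread.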
Maybe fixed here?
https://nlp.stanford.edu/software/stanford-corenlp-4.5.0b.zip
Sorry for my late reply; we have switched to spaCy.
I'll try the new version, thanks.
Ah, sorry to hear that. On account of the bug in this issue, or is there some other motivating factor? It would be good to know what we can do better.
Apart from the parser, we still use CoreNLP for some functionality.
We use spaCy's parser because it's more accurate in our tests and our AI scientists are more familiar with Python and PyTorch.
Makes sense, and thanks for the follow-up. The spaCy constituency parser is substantially more accurate than the CoreNLP parser. FWIW, Stanza (our Python software) has a parser which should be on par with spaCy's. Unfortunately, I don't expect that parser to become part of Java CoreNLP any time soon, although I won't rule anything out.
Thank you for your work.
We love the ssplit annotator, which we use as a preprocessor for spaCy's input. 😄
#1296 seems fixed, and this should be the same issue.