edu.stanford.nlp.semgraph.UnknownVertexException

Open d0ngw opened this issue 4 years ago • 5 comments

When I parse text with CoreNLP 4.3.0 in a multi-threaded task, the parser throws an exception:

java.util.concurrent.ExecutionException: edu.stanford.nlp.semgraph.UnknownVertexException: Operation attempted on unknown vertex felt/VBD in graph -> was/VBD (root)
  -> all/DT (advmod)
    -> After/IN (case)
  -> ,/, (punct)
  -> Dana/NNP (nsubj)
  -> there/RB (advmod)
    -> right/RB (advmod)
  -> next/JJ (advmod)
    -> me/PRP (obl)
      -> to/IN (case)
  -> ./. (punct)


Caused by: edu.stanford.nlp.semgraph.UnknownVertexException: null
        at edu.stanford.nlp.semgraph.SemanticGraph.parentPairs(SemanticGraph.java:730)
        at edu.stanford.nlp.semgraph.semgrex.GraphRelation$DEPENDENT$1.advance(GraphRelation.java:325)
        at edu.stanford.nlp.semgraph.semgrex.GraphRelation$SearchNodeIterator.initialize(GraphRelation.java:1103)
        at edu.stanford.nlp.semgraph.semgrex.GraphRelation$SearchNodeIterator.<init>(GraphRelation.java:1084)
        at edu.stanford.nlp.semgraph.semgrex.GraphRelation$DEPENDENT$1.<init>(GraphRelation.java:310)
        at edu.stanford.nlp.semgraph.semgrex.GraphRelation$DEPENDENT.searchNodeIterator(GraphRelation.java:310)
        at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.resetChildIter(NodePattern.java:337)
        at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.<init>(NodePattern.java:332)
        at edu.stanford.nlp.semgraph.semgrex.NodePattern.matcher(NodePattern.java:293)
        at edu.stanford.nlp.semgraph.semgrex.CoordinationPattern$CoordinationMatcher.<init>(CoordinationPattern.java:146)
        at edu.stanford.nlp.semgraph.semgrex.CoordinationPattern.matcher(CoordinationPattern.java:120)
        at edu.stanford.nlp.semgraph.semgrex.CoordinationPattern$CoordinationMatcher.<init>(CoordinationPattern.java:146)
        at edu.stanford.nlp.semgraph.semgrex.CoordinationPattern.matcher(CoordinationPattern.java:120)
        at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.resetChild(NodePattern.java:356)
        at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.goToNextNodeMatch(NodePattern.java:455)
        at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.matches(NodePattern.java:572)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexMatcher.find(SemgrexMatcher.java:193)
        at edu.stanford.nlp.trees.UniversalEnglishGrammaticalStructure.processComplex2WP(UniversalEnglishGrammaticalStructure.java:1604)
        at edu.stanford.nlp.trees.UniversalEnglishGrammaticalStructure.processMultiwordPreps(UniversalEnglishGrammaticalStructure.java:1541)
        at edu.stanford.nlp.trees.UniversalEnglishGrammaticalStructure.addEnhancements(UniversalEnglishGrammaticalStructure.java:915)
        at edu.stanford.nlp.trees.UniversalEnglishGrammaticalStructure.addEnhancements(UniversalEnglishGrammaticalStructure.java:986)
        at edu.stanford.nlp.trees.UniversalEnglishGrammaticalStructure.collapseDependencies(UniversalEnglishGrammaticalStructure.java:1042)
        at edu.stanford.nlp.trees.GrammaticalStructure.typedDependenciesCCprocessed(GrammaticalStructure.java:895)
        at edu.stanford.nlp.semgraph.SemanticGraphFactory.makeFromTree(SemanticGraphFactory.java:258)
        at edu.stanford.nlp.semgraph.SemanticGraphFactory.generateCCProcessedDependencies(SemanticGraphFactory.java:163)
        at edu.stanford.nlp.pipeline.ParserAnnotatorUtils.fillInParseAnnotations(ParserAnnotatorUtils.java:65)
        at edu.stanford.nlp.pipeline.ParserAnnotator.finishSentence(ParserAnnotator.java:309)
        at edu.stanford.nlp.pipeline.ParserAnnotator.doOneSentence(ParserAnnotator.java:275)
        at edu.stanford.nlp.pipeline.SentenceAnnotator.annotate(SentenceAnnotator.java:102)
        at edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(AnnotationPipeline.java:76)
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:655)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)

When I try to parse the same text again, it works. Is this caused by multi-threading? Thanks.

d0ngw avatar Dec 06 '21 01:12 d0ngw

Are you able to give us a bit more of an example? I've looked through the various code paths, and there's one block which is potentially using various multithreading operations. However, we've tried to secure it with locks. What system are you using? Also, what was the original text - that might give some more insight into the particular error which was triggered. It seems pretty suspicious that it's looking for a word which isn't even present in the graph.

AngledLuffa avatar Dec 07 '21 02:12 AngledLuffa

Thanks for your reply. Our system is Ubuntu 18.04.5.
I will send the original text to your email; please check it.

d0ngw avatar Dec 07 '21 05:12 d0ngw

All I see is one sentence. I meant I could use some mechanism to reproduce the error. Have you seen it happen more than once?
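
For anyone trying to build such a reproduction mechanism, here is a minimal sketch of a stress harness. It is an illustration only: `ConcurrentParseStress`, `Failure`, and `run` are hypothetical names, not part of CoreNLP, and the `annotate` callback is a stand-in for whatever call the application makes (the stack trace above goes through `StanfordCoreNLP.annotate`). It submits every text to a fixed thread pool several times and records which inputs failed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical stress harness; not part of CoreNLP. */
public class ConcurrentParseStress {

  /** One failed task: the input text and the exception it raised. */
  public static final class Failure {
    public final String text;
    public final Throwable cause;

    Failure(String text, Throwable cause) {
      this.text = text;
      this.cause = cause;
    }
  }

  /**
   * Submits every text to a fixed-size pool `rounds` times and returns the failures.
   * `annotate` is an assumption standing in for the real call, for example
   * text -> pipeline.annotate(new Annotation(text)) on a shared pipeline.
   */
  public static List<Failure> run(List<String> texts, int threads, int rounds,
                                  Consumer<String> annotate) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List<String> submitted = new ArrayList<>();
    List<Future<?>> futures = new ArrayList<>();
    try {
      for (int r = 0; r < rounds; r++) {
        for (String text : texts) {
          submitted.add(text);
          futures.add(pool.submit(() -> annotate.accept(text)));
        }
      }
      List<Failure> failures = new ArrayList<>();
      for (int i = 0; i < futures.size(); i++) {
        try {
          futures.get(i).get();
        } catch (ExecutionException e) {
          // Unwrap the ExecutionException so the original error is reported.
          failures.add(new Failure(submitted.get(i), e.getCause()));
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
      }
      return failures;
    } finally {
      pool.shutdownNow();
    }
  }
}
```

Because each failure keeps its input text, a failing sentence can be replayed on its own to check whether the error depends on the input or only shows up under concurrency.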

AngledLuffa avatar Dec 07 '21 05:12 AngledLuffa

Yes, it has happened more than once. We started a parser server to parse the text, and it happens about once every 2-3 days.

d0ngw avatar Dec 07 '21 06:12 d0ngw

Maybe fixed here?

https://nlp.stanford.edu/software/stanford-corenlp-4.5.0b.zip

AngledLuffa avatar Aug 13 '22 07:08 AngledLuffa

Sorry for my late reply. We have switched to spaCy.
I'll try the new version, thanks.

d0ngw avatar Oct 04 '22 04:10 d0ngw

Ah, sorry to hear that. On account of the bug in this issue, or is there some other motivating factor? It would be good to know what we can do better.

AngledLuffa avatar Oct 04 '22 04:10 AngledLuffa

Apart from the parser, we still use CoreNLP for some functionality.

We use spaCy's parser because it is more accurate in our tests, and our AI scientists are more familiar with Python and PyTorch.

d0ngw avatar Oct 04 '22 04:10 d0ngw

Makes sense, and thanks for the followup. The spaCy constituency parser is substantially more accurate than the CoreNLP parser. FWIW, Stanza (our Python software) has a parser which should be on par with spaCy's. Unfortunately, I don't expect that parser to become part of Java CoreNLP any time soon, although I won't rule anything out.

AngledLuffa avatar Oct 04 '22 04:10 AngledLuffa

Thank you for your work.

We love the ssplit annotator, which we use as a preprocessor for spaCy's input. 😄

d0ngw avatar Oct 04 '22 06:10 d0ngw

#1296 seems fixed, and this should be the same issue.

AngledLuffa avatar Oct 21 '22 22:10 AngledLuffa