Missing dependency in case of conjunction

Open rickbeeloo opened this issue 4 years ago • 2 comments

Description

When we parse a simple phrase such as "fermented leaves and fruit", CoreNLP misses the dependency between "fermented" and "fruit". Detecting this is crucial for our application, so we wonder whether it is possible.

A recent publication shows that such dependencies can be captured (Figure 1). However, since this competition was very recent (this month), we cannot find the corresponding scripts.

Reproduce

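# Assumes the Stanza Python client and a local CoreNLP installation.
from stanza.server import CoreNLPClient

with CoreNLPClient(annotators=['tokenize', 'ssplit', 'pos', 'depparse']) as client:
    ann = client.annotate('fermented leaves and fruit')
    sentence = ann.sentence[0]
    # Print every dependency representation to compare the arcs each one contains.
    print(sentence.enhancedPlusPlusDependencies)
    print(sentence.alternativeDependencies)
    print(sentence.basicDependencies)
    print(sentence.enhancedDependencies)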

rickbeeloo avatar Jul 23 '20 14:07 rickbeeloo

FWIW this is not likely to be fixed with the current model. Possibly, with a targeted dataset, the Stanza model could support it.

AngledLuffa avatar Feb 16 '22 01:02 AngledLuffa

The problem is that this structure is systematically ambiguous as to whether the adjective does or doesn't scope over the conjunction. CoreNLP doesn't currently have a good way to tell, so we regard not adding the arc as the conservative choice. Working out when the adjective does scope broadly is a perfectly good research area, but we see no likely future in which it is added to CoreNLP unless someone else provides a solution.

manning avatar Feb 22 '22 18:02 manning
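
For applications that prefer the broad-scope reading, one possible workaround is to post-process the parser output and propose the missing arc yourself. The sketch below is not part of CoreNLP. It assumes the dependencies have already been converted to plain (governor, relation, dependent) tuples, and the helper name `propagate_amod` and the hand-written edge list are hypothetical. Because the structure is ambiguous, this heuristic will also propose arcs in cases where the adjective only modifies the first conjunct, so its output should be treated as candidates rather than corrections.

```python
from typing import List, Tuple

# (governor, relation, dependent); a simplified stand-in for parser output.
Edge = Tuple[str, str, str]


def propagate_amod(edges: List[Edge]) -> List[Edge]:
    """Propose amod arcs from later conjuncts to the first conjunct's adjectives.

    Hypothetical helper: it over-generates whenever the adjective does not
    actually scope over the whole coordination.
    """
    existing = set(edges)
    candidates: List[Edge] = []
    for head, rel, adjective in edges:
        if rel != "amod":
            continue
        # Find every conjunct attached to the noun that the adjective modifies.
        for governor, conj_rel, conjunct in edges:
            if governor == head and conj_rel.startswith("conj"):
                proposed = (conjunct, "amod", adjective)
                if proposed not in existing and proposed not in candidates:
                    candidates.append(proposed)
    return candidates


# Hand-written edges for "fermented leaves and fruit" (an assumption about
# the typical parse, not captured CoreNLP output).
edges = [
    ("leaves", "amod", "fermented"),
    ("leaves", "conj:and", "fruit"),
    ("fruit", "cc", "and"),
]
print(propagate_amod(edges))  # [('fruit', 'amod', 'fermented')]
```

Matching on the `conj` prefix lets the same function handle both the basic relation (`conj`) and the enhanced one (`conj:and`).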