CoreNLP
Missing dependency in case of conjunction
Description
When we take a simple sentence like "fermented leaves and fruit", CoreNLP misses the dependency between "fermented" and "fruit". Detecting this is crucial for our application, so we wonder whether it is possible.
A recent publication shows that such dependencies can be captured (Figure 1). However, since this competition was very recent (this month), we cannot find the corresponding scripts.
Reproduce
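# Assumes the stanza Python client for CoreNLP; adjust the import if you use a different wrapper.
from stanza.server import CoreNLPClient

with CoreNLPClient(annotators=['tokenize', 'ssplit', 'pos', 'depparse']) as client:
    ann = client.annotate('fermented leaves and fruit')
    sentence = ann.sentence[0]
    # Print every dependency representation returned for the sentence.
    print(sentence.enhancedPlusPlusDependencies)
    print(sentence.alternativeDependencies)
    print(sentence.basicDependencies)
    print(sentence.enhancedDependencies)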
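The printed graphs are easier to compare as (governor, relation, dependent) triples. The helper below is a minimal sketch, not part of CoreNLP or stanza: `edges_as_triples` is a hypothetical name, and it assumes each graph exposes `edge` entries with `source`, `target`, and `dep` fields, where `source` and `target` are 1-based token indices into `sentence.token`.

```python
def edges_as_triples(sentence, graph):
    """Convert a dependency graph into (governor, relation, dependent) word triples.

    Assumption: edge.source and edge.target are 1-based indices into sentence.token.
    """
    words = [token.word for token in sentence.token]
    return [
        (words[edge.source - 1], edge.dep, words[edge.target - 1])
        for edge in graph.edge
    ]

# Usage inside the `with` block from the reproduction:
# for triple in edges_as_triples(sentence, sentence.enhancedPlusPlusDependencies):
#     print(triple)
```

With this view, the missing arc would show up as the absence of a triple linking "fruit" to "fermented".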
FWIW, this is not likely to be fixed with the current model. With a targeted dataset, the stanza model could possibly support it.
The problem is that this structure is systematically ambiguous as to whether the adjective scopes over the conjunction. CoreNLP doesn't currently have a good way to tell, so we regard not adding the arc as the conservative choice. Working out when the adjective does scope broadly is a perfectly good research area, but we see no likely future in which it is added to CoreNLP unless someone else provides a solution.
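For applications that would rather over-generate than miss the arc, one option is a post-processing step that always takes the wide-scope reading. The following is a minimal sketch of that idea, not something CoreNLP does: `propagate_amod_over_conj` is a hypothetical helper, edges are assumed to be (governor, relation, dependent) triples, and words stand in for token identifiers (real input should use token indices so repeated words stay distinct). Because the structure is ambiguous, this will also add wrong arcs whenever the adjective only modifies the first conjunct.

```python
from typing import List, Tuple

Edge = Tuple[str, str, str]  # (governor, relation, dependent)

def propagate_amod_over_conj(edges: List[Edge]) -> List[Edge]:
    """Copy each amod arc on a first conjunct to its other conjuncts.

    This always assumes wide scope, so it over-generates by design.
    """
    amods = [(gov, dep) for gov, rel, dep in edges if rel == "amod"]
    # Enhanced graphs may label the relation "conj:and", so match on the prefix.
    conjs = [(gov, dep) for gov, rel, dep in edges if rel.startswith("conj")]

    extra: List[Edge] = []
    for head, adjective in amods:
        for first_conjunct, other_conjunct in conjs:
            if first_conjunct != head:
                continue
            candidate = (other_conjunct, "amod", adjective)
            if candidate not in edges and candidate not in extra:
                extra.append(candidate)
    return edges + extra

# Hand-written edges for "fermented leaves and fruit" (illustrative input).
example = [
    ("leaves", "amod", "fermented"),
    ("leaves", "conj", "fruit"),
    ("fruit", "cc", "and"),
]
augmented = propagate_amod_over_conj(example)
```

On the example input, the only added arc is an amod edge from "fruit" to "fermented"; deciding when that arc is actually correct is the open question described above.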