Dependency parsing: is there some way to discourage multiple `nsubj` dependents?
A very weird English tree produced by Stanza 1.6.0 in the demo (http://stanza.run/):
My cousin my extremely rude colleague admired last year chewed the chicken enthusiastically.
In UD, no word is allowed to have multiple (plain) nsubj dependents. But "admired" has two.
Is there a recommended alternate training or decoding method in Stanza that could avoid this sort of problem?
Currently no. The parser has no idea how to handle such a weird sentence, and there's no way to give it constraints such as allowing only one nsubj per head. Clearly it is interpreting the appositive phrase as the root phrase of the sentence. If you create a few sentences of a similar nature, we can add them to the training data and see whether that improves these structures without hurting performance elsewhere:
https://github.com/stanfordnlp/handparsed-treebank
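Since the parser can't be given the constraint directly, one workaround in the meantime is to flag offending trees after parsing. Below is a minimal sketch of such a check, not something built into Stanza. The function names `heads_with_multiple_nsubj` and `rows_from_stanza` are made up for illustration, and the toy rows are a hand-written fragment modeled on the sentence above, not the parser's actual output. The only Stanza-specific assumption is that each word in `sentence.words` exposes `id`, `text`, `head`, and `deprel`.

```python
from collections import defaultdict


def heads_with_multiple_nsubj(words):
    """Find heads that have more than one plain `nsubj` dependent.

    `words` is an iterable of (id, text, head, deprel) tuples for one
    sentence. Only the exact label "nsubj" is counted; subtypes such as
    "nsubj:pass" are ignored here (an assumption of this sketch).
    Returns {head_id: [texts of its nsubj dependents]}.
    """
    subjects = defaultdict(list)
    for word_id, text, head, deprel in words:
        if deprel == "nsubj":
            subjects[head].append(text)
    return {head: deps for head, deps in subjects.items() if len(deps) > 1}


def rows_from_stanza(sentence):
    """Hypothetical helper: convert a parsed Stanza sentence into tuples."""
    return [(w.id, w.text, w.head, w.deprel) for w in sentence.words]


# Hand-written fragment (illustrative only): two words attached to
# "admired" (token 7) as nsubj, which is the problem reported above.
toy_rows = [
    (2, "cousin", 7, "nsubj"),
    (6, "colleague", 7, "nsubj"),
    (7, "admired", 0, "root"),
]

violations = heads_with_multiple_nsubj(toy_rows)
print(violations)
```

With a real pipeline, the same check could be run per sentence as `heads_with_multiple_nsubj(rows_from_stanza(sentence))` for each `sentence` in `doc.sentences`. Sentences that get flagged would be natural candidates to hand-correct and contribute to the treebank linked above.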
(Sent by email in reply to Nathan Schneider's message of Mon, Jan 29, 2024, 8:00 PM on https://github.com/stanfordnlp/stanza/issues/1340.)