transition-amr-parser
parser doesn't produce amr-unknown
I was able to train the parser as per your instructions, but when testing the trained model I found that it didn't produce an amr-unknown node. For example:
Text: Which architect of Marine Corps Air Station Kaneohe Bay was also tenant of New Sanno hotel?
# ::node 1 person 1-2
# ::node 2 architect-01 1-2
# ::node 3 facility 3-9
# ::node 5 also 10-11
# ::node 6 reside-01 11-12
# ::node 7 company 13-16
# ::node 10 name 3-9
# ::node 11 "Marine" 3-9
# ::node 12 "Corps" 3-9
# ::node 13 "Air" 3-9
# ::node 14 "Station" 3-9
# ::node 15 "Kaneohe" 3-9
# ::node 16 "Bay" 3-9
# ::node 18 name 13-16
# ::node 19 "New" 13-16
# ::node 20 "Sanno" 13-16
# ::node 21 "Hotel" 13-16
# ::root 6 reside-01
# ::edge person ARG0-of architect-01 1 2
# ::edge architect-01 ARG1 facility 2 3
# ::edge reside-01 mod also 6 5
# ::edge reside-01 ARG0 person 6 1
# ::edge reside-01 ARG1 company 6 7
# ::edge facility name name 3 10
# ::edge name op1 "Marine" 10 11
# ::edge name op2 "Corps" 10 12
# ::edge name op3 "Air" 10 13
# ::edge name op4 "Station" 10 14
# ::edge name op5 "Kaneohe" 10 15
# ::edge name op6 "Bay" 10 16
# ::edge company name name 7 18
# ::edge name op1 "New" 18 19
# ::edge name op2 "Sanno" 18 20
# ::edge name op3 "Hotel" 18 21
# ::short {1: 'p', 2: 'a', 3: 'f', 5: 'a2', 6: 'r', 7: 'c', 10: 'n', 11: 'x0', 12: 'x1', 13: 'x2', 14: 'x3', 15: 'x4', 16: 'x5', 18: 'n2', 19: 'x6', 20: 'x7', 21: 'x8'}
(r / reside-01
:ARG0 (p / person
:ARG0-of (a / architect-01
:ARG1 (f / facility
:name (n / name
:op1 "Marine"
:op2 "Corps"
:op3 "Air"
:op4 "Station"
:op5 "Kaneohe"
:op6 "Bay"))))
:ARG1 (c / company
:name (n2 / name
:op1 "New"
:op2 "Sanno"
:op3 "Hotel"))
:mod (a2 / also))
Parsing the same sentence with the amrlib parser, for example, gives me this result with amr-unknown:
# ::snt Which architect of Marine Corps Air Station Kaneohe Bay was also tenant of New Sanno hotel?
(t / tenant-01
:ARG0 (a / amr-unknown
:ARG0-of (a2 / architect-01
:ARG1 (f / facility
:name (n / name
:op1 "Marine"
:op2 "Corps"
:op3 "Air"
:op4 "Station"
:op5 "Kaneohe"
:op6 "Bay"))))
:ARG1 (h / hotel
:name (n2 / name
:op1 "New"
:op2 "Sanno"))
:mod (a3 / also))
It should produce amr-unknown; we use this often for question parsing.
What did you train it with? I just checked on a v0.4.2 deploy and it parses correctly. Also, do you tokenize?
Hi @ramon-astudillo, I was trying to follow your instructions from here for setup and training (the default action-pointer network config: bash run/run_experiment.sh configs/amr2.0-action-pointer.sh). This is the code for inference:
import string
from transition_amr_parser.parse import AMRParser

# Checkpoint produced by the action-pointer training run
amr_parser_checkpoint = "/DATA/AMR2.0/models/exp_cofill_o8.3_act-states_RoBERTa-large-top24/_act-pos-grh_vmask1_shiftpos1_ptr-lay6-h1_grh-lay123-h2-allprev_1in1out_cam-layall-h2-abuf/ep120-seed42/checkpoint_best.pt"
parser = AMRParser.from_checkpoint(amr_parser_checkpoint)
text = "Which architect of Marine Corps Air Station Kaneohe Bay was also tenant of New Sanno hotel?"
# Crude tokenization: split on whitespace, strip punctuation from each token
words = [word.strip(string.punctuation) for word in text.split()]
annotations = parser.parse_sentences([words])
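(A side note on the tokenization question above: strip(string.punctuation) deletes the trailing "?" rather than keeping it as its own token, which may matter for question parsing. Below is a minimal sketch of a tokenizer that keeps punctuation as separate tokens; simple_tokenize is an illustrative helper, not part of the parser API.)

import re

def simple_tokenize(text):
    # Split into word runs and single punctuation marks, so cues like
    # the question mark survive as their own tokens.
    return re.findall(r"\w+|[^\w\s]", text)

simple_tokenize("Was it also a hotel?")
# -> ['Was', 'it', 'also', 'a', 'hotel', '?']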
Would you mind sharing your trained checkpoint to see if it makes any difference?
I am certain it should. We are looking into sharing pre-trained models, but I cannot say anything at this point.
Also, FYI, we will update to v0.5.1 soon (post the EMNLP preprint submission deadline). This new model (Structured-BART) is the new SoTA for AMR2.0 and will be published at EMNLP 2021; a not-yet-updated preprint is here: https://openreview.net/forum?id=qjDQCHLXCNj
From experience parsing questions, I can say that silver-data fine-tuning works well. You can parse some text corpus with questions, filter it with a couple of rules*, and then use it as additional training data. The scheme of silver+gold pre-training followed by gold fine-tuning seems to work best; see e.g. https://aclanthology.org/2020.findings-emnlp.288/
(*) For example, ignore all parses containing :rel (which indicates a detached subgraph) or missing amr-unknown (if you are certain the sentence should have one).
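For concreteness, here is a minimal sketch of such a filter, assuming each silver parse is available as its annotation text; keep_parse is an illustrative name and the plain substring checks are a simplifying assumption, not project code.

def keep_parse(amr_text, expect_unknown=True):
    # Rule 1: drop parses containing :rel, which indicates a detached subgraph.
    if ":rel" in amr_text:
        return False
    # Rule 2: drop parses missing amr-unknown when the question should have one.
    if expect_unknown and "amr-unknown" not in amr_text:
        return False
    return True

# Keep only usable question parses before adding them as silver training data
keep_parse("(a / amr-unknown :ARG0-of (t / tenant-01))")  # True
keep_parse("(t / tenant-01 :rel (h / hotel))")            # False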