S2S-AMR-Parser

Can only get 80.8 on AMR 2.0

Open headacheboy opened this issue 4 years ago • 5 comments

Hi

When parsing AMR 2.0 with the model PTM-MT(WMT14B)-SemPar(WMT14M), we only get 80.8 instead of the 81.4 reported in your paper.

We're wondering whether there are any important details in pre-processing and post-processing for AMR 2.0. Could you provide more details?

Thank you!

headacheboy, Oct 21 '20 04:10

Hello,

We did not do any additional operations in pre-processing ~~and post-processing~~. Note that:

  1. We use the source sentences from "amrs" in AMR 2.0 instead of those from "alignment". The tokenization used in "alignment" is more complicated, so we use AllenNLP to handle tokenization instead.
  2. Tokenization and BPE are required for the source sentences.
  3. The sentences used in post-processing are the raw ones, with neither tokenization nor BPE applied. In the command below, we pass "sent" rather than "sent.tok" or "sent.tok.bpe".
python2 postprocess_AMRs.py -f sent.amr -s sent
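
A quick way to catch the mix-up described in point 3 is to check that the file passed to `-s` carries no BPE markers. The sketch below is only a sanity check under one assumption: that BPE was applied in the subword-nmt style, where a split token ends in `@@`. The helper names `looks_bpe_segmented` and `count_bpe_lines` are hypothetical and not part of this repository.

```python
def looks_bpe_segmented(line, marker="@@"):
    """Return True if any whitespace-separated token ends with the BPE marker.

    Assumption: sub-word pieces are joined with a trailing "@@"
    (subword-nmt convention). Change `marker` if your BPE tool differs.
    """
    return any(tok.endswith(marker) for tok in line.split())


def count_bpe_lines(path, marker="@@"):
    """Count the lines in `path` that still carry BPE markers."""
    with open(path, encoding="utf-8") as f:
        return sum(looks_bpe_segmented(line, marker) for line in f)
```

If `count_bpe_lines("sent")` is greater than zero, the file given to `-s` is probably a BPE-segmented one rather than the raw sentences.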

Good Luck! xdqkid

xdqkid, Oct 22 '20 08:10

Hi,

For post-processing, I use the latest version of RikVN/AMR, in which the API for getting wiki labels has changed. I'm wondering whether this lowers the model's performance...

headacheboy, Oct 23 '20 14:10

Hi, thank you for the reminder! During post-processing we do some extra handling of the wiki labels. We store the <name, wiki> pairs from the training set as a dictionary and prefer this dictionary over the default wiki lookup, get_wiki_from_spotlight_by_name. This may have some influence on the result.
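
The dictionary-first lookup described above can be sketched roughly as follows. This is an illustration, not the code used for the paper: the function names `build_wiki_dict` and `lookup_wiki` are hypothetical, `fallback` merely stands in for whatever default lookup is in use (for example get_wiki_from_spotlight_by_name, whose signature is not assumed here), and resolving conflicting labels by frequency is an assumption the thread does not confirm.

```python
from collections import Counter, defaultdict


def build_wiki_dict(pairs):
    """Build a name -> wiki mapping from (name, wiki) pairs seen in training.

    Assumption: if a name occurs with several wiki labels, keep the most
    frequent one.
    """
    counts = defaultdict(Counter)
    for name, wiki in pairs:
        counts[name][wiki] += 1
    return {name: c.most_common(1)[0][0] for name, c in counts.items()}


def lookup_wiki(name, wiki_dict, fallback):
    """Prefer the training-set dictionary; call `fallback(name)` otherwise."""
    if name in wiki_dict:
        return wiki_dict[name]
    return fallback(name)
```

With this shape, names seen in training never reach the external lookup, so changes to that lookup's API only affect unseen names.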

I'm sorry, it has been a long time since I modified the wiki handling, and I have almost forgotten what I changed.

BTW,

  1. We also tried the wiki handling used in amr_2.0_utils of STOG. It improved the Wiki score by 3.2 points but the final Smatch F1 by less than 0.1 point, so we did not adopt it in the end. I therefore think the wiki step may not be the main reason.
  2. Delete the sent.amr.* files (e.g. sent.amr.pruned.wiki.coref all) and run post-processing again. It seems post-processing itself also causes some fluctuation in the score.
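
For the cleanup in point 2, a small helper like the one below removes the derived files while keeping sent.amr itself. It is a minimal sketch; the name `remove_stale_outputs` is hypothetical and not part of this repository.

```python
import glob
import os


def remove_stale_outputs(amr_file):
    """Delete every file named `<amr_file>.<suffix>`, keeping `amr_file` itself.

    Returns the sorted list of removed paths.
    """
    removed = []
    # glob.escape guards against special characters in the file name.
    for path in glob.glob(glob.escape(amr_file) + ".*"):
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return sorted(removed)
```

After calling `remove_stale_outputs("sent.amr")`, post-processing can be rerun with the same command as before: `python2 postprocess_AMRs.py -f sent.amr -s sent`.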

xdqkid, Oct 23 '20 16:10

Hi,

I get 81.0 now. But I am unable to get the best result 81.4...

Could you provide your modification of post-processing and the <name, wiki> dictionary on AMR2.0?

Thank you!

headacheboy, Oct 24 '20 13:10

Hi,

Thanks for your advice. I have been too busy recently (looking for a job, preparing my dissertation, building websites for CCMT2020 & AACL2020). I will probably find time to push an update containing the full code later this year or next year.

Cheers

xdqkid, Oct 24 '20 14:10