Eraser-Benchmark-Baseline-Models

The results of the BERT-LSTM model are different from the paper

Open aiishii opened this issue 5 years ago • 2 comments

Hello. Let me ask you a question.

I tried to build a BERT-LSTM model from your paper using the Movie Reviews data, but I couldn't reproduce the paper's results. My results are as follows.

Training accuracy: train 0.925, validation 0.833, test 0.849

Prediction:

| Model | Performance | AUPRC | Comprehensiveness | Sufficiency |
| --- | --- | --- | --- | --- |
| BERT-LSTM + Attention | 0.829 | 0.463 | 0.223 | 0.141 |
| BERT-LSTM + Simple Gradient | 0.829 | 0.469 | 0.222 | 0.141 |

The performance in Table 4 of the paper is 0.974, while my result is 0.829, which is very different.
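For reference, the comprehensiveness and sufficiency columns above can be understood as probability differences, roughly as the ERASER paper defines them. This is only a sketch of my understanding: `p_full`, `p_without_rationale`, and `p_rationale_only` are hypothetical stand-ins for the model's probability of the predicted class under each input variant.

```python
def comprehensiveness(p_full: float, p_without_rationale: float) -> float:
    """p(y|x) - p(y|x with rationale tokens removed).

    High values mean the rationale was actually needed for the prediction.
    """
    return p_full - p_without_rationale


def sufficiency(p_full: float, p_rationale_only: float) -> float:
    """p(y|x) - p(y|rationale tokens only).

    Low values mean the rationale alone nearly supports the prediction.
    """
    return p_full - p_rationale_only


# Example with made-up probabilities:
comp = comprehensiveness(0.90, 0.45)  # removing the rationale hurts a lot
suff = sufficiency(0.90, 0.85)        # the rationale alone nearly suffices
```

The reported numbers are these quantities averaged over the test set (and, in ERASER, over several deletion fractions).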

The only change I made to the parameters listed in the README is reducing the prediction batch size from 4 to 2 due to lack of memory. My environment: 65 GB RAM, one NVIDIA Tesla GPU with 32 GB.

Could you tell me if there are any parameter differences or any other differences from the paper experiments?

aiishii avatar Nov 26 '19 08:11 aiishii

Hi, I tried to build bert_encoder_generator using the Movie Reviews data, but I ran into some issues. Training proceeds normally, but the results on the validation data are always the same: fscore_NEG: 0.000, fscore_POS: 0.667. I tried different BERT learning rates (5e-1, 5e-2, 5e-3, 5e-4, 5e-5), but the validation results stay the same. Could you show me how to set the parameters?
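As a side note, the exact pair fscore_NEG = 0.000 / fscore_POS = 0.667 is what you would see if the model had collapsed to predicting POS for every example on a roughly balanced dataset (as Movie Reviews is): precision 0.5 and recall 1.0 for POS gives F1 = 2/3, and zero NEG predictions gives F1 = 0. A minimal sketch, assuming a balanced label split:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0 if both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Degenerate classifier that predicts POS for everything, balanced labels:
pos_f1 = f1(precision=0.5, recall=1.0)  # ~0.667, matching fscore_POS above
neg_f1 = f1(precision=0.0, recall=0.0)  # 0.0, matching fscore_NEG above
```

If that is what is happening, the learning rate sweep alone would not change the degenerate output much, which is consistent with the identical results across runs.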

xmshi-trio avatar Jan 06 '21 07:01 xmshi-trio

Hi, the BERT encoder-generator model is extremely unstable, so it is not surprising that you are getting bad results. Could you try the word_emb_encoder_generator model instead? Also try setting reinforce_loss_weight to 0 here https://github.com/successar/Eraser-Benchmark-Baseline-Models/blob/894bfba09e8966aec9b046ddc595d434504a4f90/Rationale_model/training_config/classifiers/bert_encoder_generator.jsonnet#L99 and see if you still get the same problem.
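In the jsonnet config linked above, the suggested change would look something like the fragment below (the surrounding keys are illustrative; only the `reinforce_loss_weight` field is taken from the linked line):

```jsonnet
{
  // ... other model fields in bert_encoder_generator.jsonnet ...
  reinforce_loss_weight: 0,  // set to 0 to disable the REINFORCE loss term
}
```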

successar avatar Jan 07 '21 21:01 successar