
the candidate results of all the samples are the same

Open myjxm opened this issue 5 years ago • 17 comments

Hello! First, thanks for your contribution. When I try to test BertSumAbs, the command is:

python train.py -task abs -mode test -batch_size 30 -test_batch_size 5 -bert_data_path ../bert_data_cnndm_final/cnndm -log_file ../logs/val_abs_bert_cnndm_eng -model_path ../models/abs_trans_eng/ -sep_optim true -use_interval true -visible_gpus 0 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path ../results/abs_bert_cnndm_eng/ -test_from ../models/abs_trans_eng/model_step_200000.pt

I got a wrong candidate file. For example, PreSumm-master/results/abs_bert_cnndm_eng/.200000.candidate contains the same degenerate sequence repeated over and over, on every line:

new : new : : : new york 's the u.s.s.s.s.s.s.s.s.s.s.s.s.s.s.s.s.s.s.s.s [...]

(the same phrase and the long `.s` run repeat many times, identically for every test sample)

It seems like all the test samples got the same prediction result, and the result is very bad. But both the ".200000.gold" and ".200000.raw_src" files are correct. Is there anything wrong?

myjxm avatar Sep 30 '19 03:09 myjxm

Mine too; all outputs are the same.

rajeshsahu09 avatar Oct 15 '19 13:10 rajeshsahu09

me too.

chen1234yue avatar Oct 15 '19 14:10 chen1234yue

Please paste your training commands here

nlpyang avatar Oct 16 '19 05:10 nlpyang

python3 train.py -task abs -mode train -bert_data_path bert_data/ -dec_dropout 0.2 -model_path model_abs/ -sep_optim true -lr_bert 0.002 -lr_dec 0.2 -save_checkpoint_steps 2000 -batch_size 140 -train_steps 200000 -report_every 50 -accum_count 5 -use_bert_emb true -use_interval true -warmup_steps_bert 20000 -warmup_steps_dec 10000 -max_pos 512 -visible_gpus 0 -log_file abs_bert_cnndm

rajeshsahu09 avatar Oct 16 '19 05:10 rajeshsahu09

With only 1 gpu for training, you need to accumulate the gradient for a much larger step, or the model cannot be trained effectively.

nlpyang avatar Oct 16 '19 05:10 nlpyang
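For context, a rough way to size the gradient accumulation on one GPU is to match the effective batch per optimizer update (per-GPU batch × accumulation steps × number of GPUs). This is a back-of-the-envelope heuristic, not an official PreSumm formula; the multi-GPU figures below are assumed from the training command and comments in this thread:

```python
def effective_batch(batch_size, accum_count, n_gpus):
    # Work contributing to one optimizer update: per-GPU batch,
    # times gradient-accumulation steps, times number of GPUs.
    return batch_size * accum_count * n_gpus

# Assumed multi-GPU recipe: -batch_size 140, -accum_count 5, 4 GPUs
target = effective_batch(140, 5, 4)                # 2800

# On a single GPU with the same -batch_size, match it by raising -accum_count
accum_needed = target // effective_batch(140, 1, 1)
print(accum_needed)                                # 20
```

So with one GPU and the same `-batch_size`, something like `-accum_count 20` would roughly match a 4-GPU run with `-accum_count 5`.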

Sir, but I have only one GPU. Can't the training be effective on that?

rajeshsahu09 avatar Oct 16 '19 06:10 rajeshsahu09

You can use our Trained Models.

nlpyang avatar Oct 16 '19 06:10 nlpyang

With only 1 gpu for training, you need to accumulate the gradient for a much larger step, or the model cannot be trained effectively.

So how many GPUs do you need for training?

cuthbertjohnkarawa avatar Nov 11 '19 07:11 cuthbertjohnkarawa

For extractive summarization, the author trained the model on 3 GPUs.

For abstractive summarization, the author trained the model on 4 GPUs for 2 days.

astariul avatar Nov 18 '19 04:11 astariul

I have faced the same repetition issue when training a Korean model. After some research, I found that this is a general problem in natural language generation, known as degeneration.

I have added an extra module to the decoder, replacing beam search.

Please let me know if anyone is interested in it

Thanks

robinsongh381 avatar Nov 27 '19 05:11 robinsongh381

@robinsongh381 We are interested !

So you replaced beam search and got better results ?

astariul avatar Nov 27 '19 05:11 astariul

@Colanim Sorry for the late reply. I have replaced the decoder with the following method, which proposes a new way of sampling tokens at decoding steps rather than relying only on beam search.

The paper suggested two methods, and they are implemented here.

From my experience, I could avoid the repetition issue with the proposed method and hence improve the ROUGE score!

Hope my opinion helps.

robinsongh381 avatar Dec 02 '19 05:12 robinsongh381
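For readers hitting the same degeneration: a common sampling-based replacement for pure beam search is nucleus (top-p) sampling, which keeps only the smallest set of high-probability tokens whose cumulative mass exceeds p and samples from that set. Whether this matches robinsongh381's exact method is an assumption on my part; the sketch below is a minimal pure-Python illustration of the filtering step, not PreSumm code:

```python
def top_p_filter(probs, p=0.9):
    # Nucleus (top-p) filter: keep the smallest set of tokens whose
    # cumulative probability reaches p, renormalize them, zero the rest.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = set(), 0.0
    for i in order:
        kept.add(i)
        cum += probs[i]
        if cum >= p:
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

# At each decoding step you would then draw the next token from the
# filtered distribution, e.g. with random.choices(range(V), weights=filtered).
filtered = top_p_filter([0.5, 0.3, 0.15, 0.05], p=0.9)
print(filtered)  # low-probability tail token is zeroed out
```

Because sampling injects randomness at every step, the decoder can no longer lock into the single repetitive mode that greedy/beam decoding falls into.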

Thanks for the message!

Do you remember (approximately) how big the difference in ROUGE score was?

astariul avatar Dec 02 '19 08:12 astariul

From my experience, I could avoid the repetition issue with the proposed method and hence improve the ROUGE score!

Can you share your results?

cuthbertjohnkarawa avatar Dec 12 '19 01:12 cuthbertjohnkarawa

Hi. I am using the pre-trained models for testing on the CNN dataset. This is the command I am giving: python train.py -task abs -mode test -test_from ~/Downloads/cnndm_baseline_best.pt -batch_size 3000 -test_batch_size 500 -bert_data_path ../bert_data/test -log_file ../logs/val_abs_bert_cnndm -sep_optim true -use_interval true -visible_gpus 0 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path ../logs/abs_bert_cnndm

But my candidate result is the same as for the very first single file I used; after that it doesn't change. What am I doing wrong?

Shanzaay avatar Jan 24 '20 20:01 Shanzaay

You said that while training on 1 GPU we need to set the gradient accumulation count greater than 5. How much should that be? Please help.

With only 1 gpu for training, you need to accumulate the gradient for a much larger step, or the model cannot be trained effectively.

Rumi4 avatar Jan 26 '22 09:01 Rumi4

python3 train.py -task abs -mode train -bert_data_path bert_data/ -dec_dropout 0.2 -model_path model_abs/ -sep_optim true -lr_bert 0.002 -lr_dec 0.2 -save_checkpoint_steps 2000 -batch_size 140 -train_steps 200000 -report_every 50 -accum_count 5 -use_bert_emb true -use_interval true -warmup_steps_bert 20000 -warmup_steps_dec 10000 -max_pos 512 -visible_gpus 0 -log_file abs_bert_cnndm

Did you solve the problem of identical generated sentences when training on a single GPU?

wbchief avatar Jun 05 '22 03:06 wbchief