Astariul

Results 46 comments of Astariul

I'm also very interested in the pretraining script. Any updates? @ngoyal2707 @yinhanliu

Data shuffling is random, and the randomness is controlled by the seed. By default, the seed is always `666`. You can change it with the `-seed` option, for example `-seed 42`. So...
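A minimal sketch of why a fixed seed makes the shuffle reproducible (plain Python `random` here; the actual script may also seed `torch` and `numpy`):

```python
import random

def shuffled(data, seed=666):
    # A dedicated RNG seeded with a fixed value always
    # produces the same shuffle order for the same input.
    rng = random.Random(seed)
    data = list(data)
    rng.shuffle(data)
    return data

# Same seed -> identical order on every run.
a = shuffled(range(10), seed=666)
b = shuffled(range(10), seed=666)
# A different seed gives a different (but equally reproducible) order.
c = shuffled(range(10), seed=42)
```

So two runs with the default seed see the batches in the same order, and changing `-seed` changes that order deterministically.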

The gold and candidate files are written together here: https://github.com/nlpyang/PreSumm/blob/ce8dc017fbef7c12b1b4bd764f0c3d20911ead5e/src/models/predictor.py#L175-L177 So the order cannot be wrong, unless you modified the code.

I think that's why 3 files are written: `gold`, `candidate`, and `source`. These files are in the same order, so if you want to access the article, just refer to...
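To illustrate the point (a hypothetical sketch, not PreSumm's actual code): because the three files are written line by line in a single loop, reading them back with `zip` recovers each (reference, prediction, article) triple:

```python
import os
import tempfile

# Hypothetical data standing in for one evaluation batch.
examples = [
    ("ref 1", "pred 1", "article 1"),
    ("ref 2", "pred 2", "article 2"),
]

d = tempfile.mkdtemp()
paths = {name: os.path.join(d, name) for name in ("gold", "candidate", "source")}

# Write all three files inside the same loop, as predictor.py does,
# so line i of each file belongs to the same example.
with open(paths["gold"], "w") as g, open(paths["candidate"], "w") as c, \
        open(paths["source"], "w") as s:
    for ref, pred, art in examples:
        g.write(ref + "\n")
        c.write(pred + "\n")
        s.write(art + "\n")

# Reading them back with zip keeps each triple aligned.
with open(paths["gold"]) as g, open(paths["candidate"]) as c, \
        open(paths["source"]) as s:
    triples = [
        (r.strip(), p.strip(), a.strip())
        for r, p, a in zip(g, c, s)
    ]
```

Line `i` of `source` is always the article whose summary sits on line `i` of `gold` and `candidate`.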

It comes from your PyTorch version: https://github.com/nlpyang/PreSumm/issues/2. Downgrade to `1.1.0`.
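For example (assuming a pip-based environment; adjust accordingly if you use conda):

```shell
# Pin PyTorch to 1.1.0, the version the PreSumm code was written against.
pip install torch==1.1.0
```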

For extractive summarization, the author trained the model on 3 GPUs. For abstractive summarization, the author trained it on 4 GPUs for 2 days.

@robinsongh381 We are interested! So you replaced beam search and got better results?

Thanks for the message! Do you remember (approximately) how big the difference in ROUGE score is?

Try the same command, but replace `-bert_data_path ../bert_data/` with `-bert_data_path ../bert_data/cnndm`.
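The reason, as I understand the data loader (the exact glob pattern here is an assumption; check the repo's loader code): `-bert_data_path` is a file *prefix*, not a directory, so the shards are located with something like:

```python
import glob
import os
import tempfile

# Hypothetical illustration: PreSumm-style shards named <prefix>.train.<N>.pt.
d = tempfile.mkdtemp()
for i in range(3):
    open(os.path.join(d, "cnndm.train.%d.pt" % i), "w").close()

# With only the directory as the prefix, the pattern matches nothing...
empty = glob.glob(os.path.join(d, "") + ".train.[0-9]*.pt")

# ...but with the dataset prefix ("cnndm") included, the shards are found.
found = sorted(glob.glob(os.path.join(d, "cnndm") + ".train.[0-9]*.pt"))
```

That is why passing the bare `../bert_data/` directory finds no data, while `../bert_data/cnndm` works.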

How did you preprocess the data? Did you do it yourself, or did you use the already-processed data?