mBART Continued Pretraining

Open rohandas14 opened this issue 2 years ago • 4 comments

Hello!

I am trying to perform continued pretraining on the mbart.cc.25 pretrained checkpoint using the multilingual denoising objective. However, I am not sure how to prepare and pre-process the data for the continued pretraining step. It would be great if someone could point me to what the pre-processing script should look like.

Thanks!

rohandas14, May 12 '22 21:05

@ngoyal2707 and/or @myleott, any pointers would be really helpful!

rohandas14, May 12 '22 21:05

Same question here. The pretraining code seems to be sorely missing. Any help in that respect would be great.

BramVanroy, Sep 10 '22 19:09

Has this been addressed since then? I'm dealing with the exact same problem here :(

alimrsn79, Jan 15 '24 23:01

Have you been able to resolve this issue yet? @alimrsn79 @BramVanroy

tarudesu, Mar 06 '24 19:03