fairseq
mBART Continued Pretraining
Hello!
I am trying to perform continued pretraining on the mbart.cc.25 pretrained checkpoint using the multilingual denoising objective. However, I am not sure how to prepare and pre-process the data for the continued pretraining step. It would be great if someone could point me to what the pre-processing script should look like.
Thanks!
@ngoyal2707 and/or @myleott, any pointers would be really helpful!
Same question here. The pretraining code seems to be sorely missing. Any help in that respect would be great.
Has this been addressed since then? I'm dealing with the exact same problem here :(
Were you able to resolve this issue? @alimrsn79 @BramVanroy