fairseq Generating with MBART50 not working

Open MathieuGrosso opened this issue 1 year ago • 0 comments

🐛 Bug

Hello, I have downloaded the many-to-many mBART50 model and I want to test it on en-fr with data from WMT. It did not work: I keep getting the same word generated instead of a good translation. Do you know why? Is the model not pretrained? Maybe I have not understood it correctly.

Here is a file showing what I get:

To Reproduce

What I did:

First, I downloaded the WMT en-fr dataset with sacrebleu, then binarized it:

python /path/to/fairseq/examples/multilingual/data_scripts/binarize.py

export path_2_data=$work_dir/databin
export model=$work_dir/model.pt
export langs="ar_AR,....,sl_SI"
export source_lang="en_XX"
export target_lang="fr_XX"

fairseq-generate $path_2_data \
  --path $model \
  --task translation_from_pretrained_bart \
  --gen-subset test \
  -s en_XX -t fr_XX \
  --sacrebleu --remove-bpe 'sentencepiece' \
  --batch-size 32 \
  --encoder-langtok "src" \
  --decoder-langtok \
  --langs $langs
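To put a number on the "same word generated" symptom, a minimal sketch along these lines could scan the generation log. It assumes the log contains tab-separated hypothesis lines of the form `H-<id>`, score, tokens, as fairseq-generate normally prints them. The function names and the 0.8 threshold are hypothetical choices for illustration, not part of fairseq.

```python
from collections import Counter


def hypotheses(lines):
    """Yield (sample_id, tokens) for each 'H-' line of a generation log."""
    for line in lines:
        if not line.startswith("H-"):
            continue  # skip source (S-), target (T-) and other lines
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # malformed line, skip it
        yield parts[0][2:], parts[2].split()


def is_degenerate(tokens, threshold=0.8):
    """True if one token makes up at least `threshold` of the hypothesis.

    The threshold is an arbitrary assumption, not a fairseq setting.
    """
    if len(tokens) < 2:
        return False
    most_common_count = Counter(tokens).most_common(1)[0][1]
    return most_common_count / len(tokens) >= threshold


def degenerate_ids(lines, threshold=0.8):
    """Return the ids of hypotheses dominated by a single repeated token."""
    return [sid for sid, toks in hypotheses(lines) if is_degenerate(toks, threshold)]


# Made-up sample lines, only to show the expected log layout.
sample = [
    "S-0\tHello world .\n",
    "H-0\t-0.5\tthe the the the the\n",
    "H-1\t-0.3\tBonjour le monde .\n",
]
print(degenerate_ids(sample))  # → ['0']
```

On a real run, the log file would be passed in place of `sample`, for example `degenerate_ids(open("gen.out", encoding="utf-8"))`, where `gen.out` is a hypothetical file holding the redirected output of the command above.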

Environment

fairseq version: 1.0.0
PyTorch version: 1.11
How you installed fairseq (pip, source): from zip + pip install --editable .
Python version: 3.8
CUDA/cuDNN version: 11.X
GPU models and configuration: Tesla V100

Jul 05 '22 15:07 MathieuGrosso
