Translation quality of the NLLB-200 (Dense, 3.3B) model is worse than all other models for Japanese & English. Can anyone suggest why?

Open suraj143rosy opened this issue 1 year ago • 4 comments

Hi,

I have used all the different NLLB models for Japanese-to-English and English-to-Japanese translation, and I have observed that the translation quality of NLLB-200 (Dense, 3.3B) is much worse than that of all the other models. Can someone suggest a reason for this?

suraj143rosy avatar Aug 10 '22 10:08 suraj143rosy
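For reference, a minimal sketch of the kind of side-by-side comparison described above, using the Hugging Face `transformers` port of NLLB (the checkpoint names, example sentence, and generation settings here are assumptions for illustration, not taken from the thread):

```python
# Hedged sketch: compare two NLLB checkpoints on the same Japanese sentence.
# Swap in facebook/nllb-200-1.3B etc. to reproduce the comparison in the thread.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

SRC, TGT = "jpn_Jpan", "eng_Latn"          # FLORES-200 language codes used by NLLB
sentence = "私は昨日、東京で友人に会いました。"  # illustrative input, not from the issue

for name in ["facebook/nllb-200-distilled-600M", "facebook/nllb-200-3.3B"]:
    tokenizer = AutoTokenizer.from_pretrained(name, src_lang=SRC)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    inputs = tokenizer(sentence, return_tensors="pt")
    out = model.generate(
        **inputs,
        # Force the decoder to start in the target language.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT),
        max_new_tokens=64,
        num_beams=5,
    )
    print(name, "->", tokenizer.batch_decode(out, skip_special_tokens=True)[0])
```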

Which models have you compared NLLB-200 3.3B dense with?

vedanuj avatar Aug 10 '22 15:08 vedanuj

I have compared NLLB-200 3.3B dense with the NLLB-200 54.5B MoE, NLLB-200 1.3B dense, and NLLB-200-Distilled 600M dense models.

suraj143rosy avatar Aug 11 '22 04:08 suraj143rosy

@suraj143rosy Are you comparing NLLB (a multilingual model) to a Japanese-English-only model?

While this is not the whole explanation, it definitely has an impact: https://github.com/facebookresearch/fairseq/issues/4560. So I believe a lot of characters are unknown (`<unk>`) to the model.

gmryu avatar Aug 11 '22 07:08 gmryu
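A hedged sketch of the kind of check the comment above is hinting at: tokenize a Japanese input with the NLLB tokenizer (the Hugging Face port is assumed here) and count how many tokens fall back to `<unk>`. The example sentence is illustrative only.

```python
# Count how many input tokens the NLLB tokenizer maps to <unk>;
# a high count would support the hypothesis in issue #4560.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "facebook/nllb-200-3.3B", src_lang="jpn_Jpan"
)
sentence = "彼は「面白い」と言った。"  # illustrative Japanese input
ids = tokenizer(sentence)["input_ids"]
unk_count = sum(1 for i in ids if i == tokenizer.unk_token_id)
print(f"{unk_count} of {len(ids)} tokens are <unk>")
```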

No, I am comparing these NLLB models with each other.

suraj143rosy avatar Aug 11 '22 07:08 suraj143rosy