fairseq
Translation quality of the NLLB-200 (Dense, 3.3B) model is worse than all other NLLB models for Japanese and English. Can anyone suggest why?
Hi,
I have used all the different NLLB models for Japanese-to-English and English-to-Japanese translation. I have observed that the translation quality of NLLB-200 (Dense, 3.3B) is much worse than that of all the other models. Can someone suggest a reason for this?
Which models have you compared NLLB-200 3.3B dense with?
I have compared NLLB-200 3.3B dense with the NLLB-200 54.5B MoE, NLLB-200 1.3B dense, and NLLB-200-Distilled 600M dense models.
--
@suraj143rosy Are you comparing NLLB (a multilingual model) to a Japanese-English-only model?
While this may not be the whole explanation, it definitely has an impact: https://github.com/facebookresearch/fairseq/issues/4560 So I believe a lot of characters are unknown (`<unk>`) to the model.
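One way to sanity-check this hypothesis is to measure what fraction of the input characters fall outside the model's vocabulary. The sketch below uses a toy ASCII-only character set as a stand-in; for a real check you would load the NLLB SentencePiece vocabulary and test coverage of Japanese characters against it.

```python
# Hypothetical sketch: estimate how many characters in a text fall outside
# a given vocabulary. The toy character set below stands in for the real
# NLLB SentencePiece vocabulary, which would have to be loaded from the
# model's tokenizer.
def unk_char_rate(text: str, vocab_chars: set) -> float:
    """Fraction of non-space characters in `text` not covered by `vocab_chars`."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    unknown = sum(1 for c in chars if c not in vocab_chars)
    return unknown / len(chars)

# Toy vocabulary: ASCII letters only, so Japanese characters count as unknown.
toy_vocab = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
mixed = "Hello 世界"  # "Hello" is covered; the two kanji are not (2 of 7 chars)
print(unk_char_rate(mixed, toy_vocab))
```

A high unknown rate on Japanese test sentences would support the linked issue's explanation; if coverage is near-complete, the quality gap is more likely coming from the 3.3B checkpoint itself.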
No, I am comparing these NLLB models with each other.