seamless_communication
Is it possible to add another language?
Good afternoon. Is it possible to add another language myself? Is there a section of the documentation that would help me extend the model?
Hi! This question has already been addressed in https://github.com/facebookresearch/seamless_communication/issues/109 and https://github.com/facebookresearch/seamless_communication/issues/32, and the short answer is "it's complicated".
For adding new languages to text translation models, there are some existing pointers, including:
- Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages by Sun et al., 2023. This paper rigorously evaluates several techniques for adding new languages to multilingual text translation models without sacrificing performance on the existing languages.
- How to fine-tune a NLLB-200 model for translating a new language, a hands-on tutorial I wrote on fine-tuning NLLB with a single pair of languages, one of which is new.
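
To give a rough idea of what "adding a language" involves on the text side, here is a minimal sketch of one common first step: registering a new language tag in the vocabulary and initializing its embedding from a related language that the model already supports. This is not code from the tutorial or from any library. The function name `add_language_token`, the toy vocabulary, and the language codes are illustrative assumptions, and a real model would do this on its tokenizer and embedding matrix before fine-tuning.

```python
import random


def add_language_token(vocab, embeddings, new_token, init_from, noise=0.0, seed=0):
    """Return copies of vocab and embeddings with new_token appended.

    Hypothetical helper: the new row is copied from the row of init_from
    (a related, already supported language), optionally with Gaussian noise.
    """
    if new_token in vocab:
        raise ValueError(f"{new_token!r} is already in the vocabulary")
    if init_from not in vocab:
        raise KeyError(f"{init_from!r} is not in the vocabulary")

    rng = random.Random(seed)
    source_row = embeddings[vocab[init_from]]
    # Start from the related language's embedding; fine-tuning then moves it.
    new_row = [x + rng.gauss(0.0, noise) for x in source_row]

    new_vocab = dict(vocab)
    new_vocab[new_token] = len(embeddings)  # the new token takes the next free index
    new_embeddings = [list(row) for row in embeddings] + [new_row]
    return new_vocab, new_embeddings


# Toy data (assumed for illustration): three tokens, two-dimensional embeddings.
vocab = {"<pad>": 0, "eng_Latn": 1, "kir_Cyrl": 2}
embeddings = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4]]

# Add a hypothetical new language tag, initialized from a related one.
vocab, embeddings = add_language_token(
    vocab, embeddings, "tyv_Cyrl", init_from="kir_Cyrl"
)
```

Copying an existing row rather than using random values gives the new tag a reasonable starting point, but the model still has to be fine-tuned on parallel data for the new language before it translates anything useful.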
For speech-to-text and speech-to-speech translation, though, there are no comprehensive guides on adding new languages yet.