Add support for Sinhala language
Hello,
I have a customer who needs Sinhala support as soon as possible. Does SeamlessM4T support Sinhala? If not, could it be added to the roadmap as a TODO? What timeline should I expect before I can demo and use it? Please let me know.
I would also need translation between Sinhala and Tamil, in both directions. Thanks.
The current Seamless family supports Sinhala only in the SeamlessM4T-Medium model, and only in the text modality (see https://github.com/facebookresearch/seamless_communication/tree/main/docs/m4t for more details, and the model cards at https://github.com/facebookresearch/seamless_communication/tree/main/src/seamless_communication/cards for the lists of languages). Sinhala is also supported by the NLLB models.
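For text translation with NLLB, something along these lines may work as a starting point. This is only a sketch, assuming the Hugging Face `transformers` library and the `facebook/nllb-200-distilled-600M` checkpoint (neither is mentioned above; any NLLB-200 checkpoint should behave the same way). The `translate` helper is a hypothetical name, and `sin_Sinh` / `tam_Taml` are the FLORES-200 codes for Sinhala and Tamil.

```python
# Sketch: Sinhala <-> Tamil text translation with an NLLB-200 checkpoint.
# Assumes `pip install transformers torch`; the checkpoint name is an assumption.

# FLORES-200 language codes used by NLLB-200 models.
SINHALA = "sin_Sinh"
TAMIL = "tam_Taml"


def translate(text, src_lang, tgt_lang,
              model_name="facebook/nllb-200-distilled-600M"):
    """Translate one string from src_lang to tgt_lang (hypothetical helper)."""
    # Imported lazily so the heavy dependency is loaded only when translating.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # The tokenizer needs the source language to prepend the right language tag.
    tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(text, return_tensors="pt")
    # Forcing the first generated token to the target language tag
    # selects the output language.
    output_ids = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=128,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling `translate(sinhala_text, SINHALA, TAMIL)` covers one direction, and swapping the last two arguments covers the other. For more than a handful of sentences, it would be better to load the tokenizer and model once rather than on every call.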
Currently, there are no plans to extend SeamlessM4T to new languages. However, if you want to translate Sinhala speech, you could train your own Sinhala SONAR speech encoder and contribute it to https://github.com/facebookresearch/SONAR; if you make it compatible with the SONAR embedding space, you will be able to use the existing SONAR text decoder to translate its output into 200 languages.