David Dale comments

Results: 74 comments of David Dale

Is it possible to add another language?

Hi! This question has already been addressed in https://github.com/facebookresearch/seamless_communication/issues/109 and https://github.com/facebookresearch/seamless_communication/issues/32, and the short answer is "it's complicated". For adding new languages to text translation models, there are some existing pointers, including:...

Silent Parts of the Audio

Hi! One potential solution would be the following: 1. Detect the silence and voice in the source audio using some external voice activity detection model. 2. Split the source audio...
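The first two steps above can be sketched roughly as follows. This is only an illustration: the energy threshold is a crude stand-in for a real voice activity detection model, and the function name `voiced_segments`, the frame length, and the threshold value are all assumptions, not part of Seamless.

```python
from typing import List, Sequence, Tuple


def voiced_segments(
    samples: Sequence[float],
    frame_len: int = 400,       # assumed frame size, e.g. 25 ms at 16 kHz
    threshold: float = 0.01,    # assumed mean-energy threshold for "voice"
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of the non-silent stretches.

    A frame counts as voiced when its mean squared amplitude reaches
    the threshold. A real VAD model would replace this check.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy >= threshold:
            if start is None:
                start = i            # a voiced stretch begins here
        elif start is not None:
            segments.append((start, i))  # the voiced stretch just ended
            start = None
    if start is not None:
        segments.append((start, len(samples)))  # audio ended while voiced
    return segments


def split_audio(samples: Sequence[float], **kwargs) -> List[Sequence[float]]:
    """Cut the source audio into its voiced chunks."""
    return [samples[a:b] for a, b in voiced_segments(samples, **kwargs)]
```

Keeping the (start, end) offsets of each chunk makes it possible to tell later where the silent gaps were in the source audio.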

Source language detection

The Seamless project did not release a speech language identification model. However, you can use a speech LID model from a related project called MMS: https://github.com/facebookresearch/fairseq/blob/main/examples/mms/README.md#lid. In https://github.com/facebookresearch/seamless_communication/issues/325 I give...

Load model weight local

@ndlongvn If you are using the HuggingFace version of the weights, then you can refer to [the documentation of the HF re-implementation of Seamless](https://huggingface.co/docs/transformers/v4.36.1/en/model_doc/seamless_m4t#usage) for the inference recipes. You will...

When use speech to text inference, how to keep the src_lang same as tgt_lang

No, spoken language identification is not integrated in Seamless models. If you want it, you'll have to apply some external LID model, such as MMS. I provide more details in...

Translation result is not good.

Hi @asasas234! Could you please provide some more details? * The exact model and code that you used to generate the translation; * The translation that you got as a...

Can the speech to speech translation preserve the original speaker's voice?

The [SeamlessExpressive](https://github.com/facebookresearch/seamless_communication/blob/main/docs/expressive/README.md) model released in November does preserve many properties of the source voice. I guess it fulfills what this issue is asking for.

Add support for Sinhala language

The current Seamless family supports Sinhala only with the SeamlessM4T-Medium model, and only in the text modality (see https://github.com/facebookresearch/seamless_communication/tree/main/docs/m4t for more details, and model cards https://github.com/facebookresearch/seamless_communication/tree/main/src/seamless_communication/cards for the lists of...

demo app.py

It is the same issue as https://github.com/facebookresearch/seamless_communication/issues/174. The solution is to translate one sentence at a time, because this is how the model has been trained.
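A minimal sketch of translating one sentence at a time could look like this. The regex splitter is a naive assumption (a proper sentence segmenter would handle abbreviations and other scripts better), and `translate_one` is a hypothetical callable standing in for whatever single-sentence translation call you use. It is not a Seamless API.

```python
import re
from typing import Callable, List

# Naive boundary: whitespace that follows ., ! or ?
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> List[str]:
    """Split text into sentences on the naive boundary above."""
    return [s.strip() for s in _SENTENCE_BOUNDARY.split(text) if s.strip()]


def translate_by_sentence(text: str, translate_one: Callable[[str], str]) -> str:
    """Translate each sentence separately and join the results.

    `translate_one` is a placeholder for a function that translates
    a single sentence with the model of your choice.
    """
    return " ".join(translate_one(sentence) for sentence in split_sentences(text))
```

For example, `translate_by_sentence(long_text, my_translate)` would call the hypothetical `my_translate` once per sentence instead of once for the whole text.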

entire sentences are dropped in T2TT

Actually, both Seamless and its text-only predecessor NLLB were trained mostly on single-sentence translation, and they are by no means guaranteed to correctly translate multiple-sentence texts. Thus, the safest recommendation...