seamless_communication
Source language detection
How can I get source language detection? This should be similar to the detect_language function in Whisper. Reason: when using a chatbot, I want to automatically detect the source language and provide the final answer directly in that language.
@Vaibhavs10, is there any way to do this? Thanks!
Would also be interested.
This is a valuable Whisper feature that is really missing from m4t. The model is much less valuable if we can't know what the original language was.
Please make this available.
The Seamless project did not release a speech language identification model. However, you can use a speech LID model from a related project called MMS: https://github.com/facebookresearch/fairseq/blob/main/examples/mms/README.md#lid.
In https://github.com/facebookresearch/seamless_communication/issues/325 I give some more details.
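For anyone who wants to try the MMS LID route, here is a minimal sketch. It is an assumption on my part, not something the Seamless repo provides: it loads the MMS LID checkpoint through Hugging Face transformers (`facebook/mms-lid-126`) instead of the fairseq scripts linked above, and it expects 16 kHz mono audio. The names `top_language` and `detect_language` are hypothetical helpers written for this example.

```python
from typing import Mapping, Sequence


def top_language(scores: Sequence[float], id2label: Mapping[int, str]) -> str:
    """Return the label with the highest score (hypothetical helper)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return id2label[best]


def detect_language(waveform, sampling_rate: int = 16_000,
                    model_id: str = "facebook/mms-lid-126") -> str:
    """Predict the spoken language of a mono waveform (1-D float array).

    Assumes the MMS LID checkpoint published for Hugging Face transformers;
    this is not part of seamless_communication itself.
    """
    # Heavy dependencies are imported lazily so top_language works without them.
    import torch
    from transformers import AutoFeatureExtractor, Wav2Vec2ForSequenceClassification

    extractor = AutoFeatureExtractor.from_pretrained(model_id)
    model = Wav2Vec2ForSequenceClassification.from_pretrained(model_id)

    inputs = extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_languages)

    # id2label maps class indices to language codes such as "eng" or "fra".
    return top_language(logits[0].tolist(), model.config.id2label)
```

The returned code could then be passed as the target language for the chatbot's reply. The MMS labels and the Seamless language codes may not line up for every language, so check the mapping before relying on it.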