seamless_communication
When using speech-to-text inference, how do I keep src_lang the same as tgt_lang?
Real-world speech often mixes two or more languages. For example, a Japanese speaker may have to use English for some words. When we transcribe, we would like to keep the text as it was originally spoken. How can we do that?
Even with ASR, we still need to pass src_lang:
ASR
This is equivalent to S2TT with <tgt_lang>=<src_lang>.
transcribed_text, _, _ = translator.predict(<path_to_input_audio>, "asr", <src_lang>)
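For reference, the ASR call above could be wrapped roughly as follows. This is only a sketch: the helper name `transcribe` and its parameter names are hypothetical, and `translator` is assumed to be an already constructed Translator whose `predict(input, task, lang)` call shape matches the README line quoted above. It does not address whether embedded English words survive in the output; that depends on the model, not on this wrapper.

```python
def transcribe(translator, audio_path, spoken_lang):
    """Transcribe audio with the "asr" task (hypothetical helper).

    Assumes `translator.predict(input, task, lang)` returns a 3-tuple whose
    first element is the text, as in the README line quoted above.
    """
    # For "asr" the single language argument is the language being spoken.
    # It serves as both source and target, so no separate tgt_lang is passed.
    transcribed_text, _, _ = translator.predict(audio_path, "asr", spoken_lang)
    return str(transcribed_text)
```

A call might look like `transcribe(translator, "input.wav", "jpn")`, where the file name is a placeholder and `"jpn"` is assumed to be the code the model expects for Japanese. The language passed here should be the dominant spoken language of the recording.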