seamless_communication
Why is such a simple example wrong?
# T2ST
# Assumes `translator` is an already-initialized seamless_communication Translator.
import torch
import torchaudio

input_text = "how do you do"
src_lang = "eng"
tgt_lang = "eng"
path_to_save_audio = "./test.wav"
translated_text, wav, sr = translator.predict(input_text, "t2st", tgt_lang, src_lang=src_lang, ngram_filtering=True)
# print(wav.shape)
torchaudio.save(path_to_save_audio, wav[0].to(torch.float32).cpu(), sample_rate=sr)
You'll have to be more specific. What is wrong?
The audio generated for the sentence "how do you do" is incorrect. It sounds like "how do you one".
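One way to narrow a report like this down is to check whether the returned `translated_text` already differs from the input, or whether only the synthesized audio is off. The sketch below is a hypothetical helper, not part of seamless_communication: `normalize` and `text_stage_matches` are names made up for illustration, and it assumes `translated_text` can be converted with `str()`.

```python
import string


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())


def text_stage_matches(input_text, translated_text) -> bool:
    """Return True if the text output equals the input after normalization.

    Hypothetical helper: for an eng -> eng request, a mismatch here suggests
    the text stage is wrong; a match suggests the problem is in the audio.
    """
    return normalize(str(input_text)) == normalize(str(translated_text))


# Example with literal strings standing in for the model output.
print(text_stage_matches("how do you do", "How do you do?"))
print(text_stage_matches("how do you do", "how do you one"))
```

In the snippet from the question, this would be called as `text_stage_matches(input_text, translated_text)` right after `translator.predict(...)`. Reporting the printed `translated_text` alongside the audio makes the issue much easier to triage.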