seamless_communication Why such a simple example is wrong?

Open HLearning opened this issue 1 year ago • 2 comments

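# T2ST (text-to-speech translation)
import torch
import torchaudio

# `translator` is assumed to be an already-initialized seamless_communication Translator.
input_text = "how do you do"
src_lang = "eng"
tgt_lang = "eng"
path_to_save_audio = "./test.wav"

translated_text, wav, sr = translator.predict(input_text, "t2st", tgt_lang, src_lang=src_lang, ngram_filtering=True)
# print(wav.shape)

# Save the first generated waveform as float32 on the CPU.
torchaudio.save(path_to_save_audio, wav[0].to(torch.float32).cpu(), sample_rate=sr)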

Oct 25 '23 09:10 HLearning

You'll have to be more specific. What is wrong?

Nov 08 '23 09:11 Mortimerp9

The audio generated for the sentence "how do you do" is incorrect. It sounds like "how do you one".

Nov 09 '23 01:11 HLearning
