Thuy Tran comments

Results: 5 comments by Thuy Tran

stitching audio samples to generate diverse positive dataset

> @ljj7975 What do you mean by second filtering?

Result getting worse when I use ground truth duration.

@leminhnguyen Hello, I'm interested in your experiments on the Vietnamese task. Have you ever compared the quality of audio synthesized by VITS with that of FastSpeech2? If yes, which one do you think...

Result getting worse when I use ground truth duration.

@leminhnguyen Thanks. By mis-punctuation in your case, do you mean bad durations or tone issues? By the way, could I ask you more questions privately, by email or on another chat platform?

My synthetic voice is bad with a sampling_rate 16k model on Aishell3 data

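> > Changing the parameter alone doesn't help; it is in the code as well. Line 172 of the original author's preprocessor.py has to be changed to `wav, _ = librosa.load(wav_path, self.sampling_rate)`, because without that argument the audio is still read at the default 22050.
>
> > Wow, thank you so much. I really hadn't noticed that spot. I'll fix it right away. I have one more question for you. I trained a model on the Biaobei (标贝) data for about 150 epochs. In the speech the model synthesizes, the first half is listenable, but the second half is completely unintelligible. Could you give me some pointers? Here is my audio: [哈尔滨今天晴，十度到二十二度，南风三级，空气质量良。.zip](https://github.com/ming024/FastSpeech2/files/7646156/default.zip)
>
> > The spectrogram is here: ![哈尔滨今天晴，十度到二十二度，南风三级，空气质量良。](https://user-images.githubusercontent.com/27938135/144529571-98927cfe-b2ed-4ad3-a5d0-ea346b142182.png) I would really appreciate any pointers.

@Tian14267 Hi, I faced the same problem as you. My audio...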

WeNet 3.0 Roadmap

Do you have plans to implement text-to-speech models?