15755841658

Results 6 issues of 15755841658

(tensorflow) wtx@wtx-Vostro-230:~/tacotron$ python preprocess_zh.py --dataset ljspeech 0%| | 0/13100 [00:00

When I run `python3 demo_server.py --checkpoint ~/tacotron/logs-tacotron/model.ckpt-185000`, I get: ![Screenshot from 2020-01-10 10-34-21](https://user-images.githubusercontent.com/45281733/72122796-2f506700-339a-11ea-88a5-bdf4a89637e2.png) Then I ran `python3 eval.py --checkpoint ~/tacotron/logs-tacotron/model.ckpt-185000 --reference_audio='test.wav'` and got: ![Screenshot from 2020-01-10 11-16-18](https://user-images.githubusercontent.com/45281733/72122958-a8e85500-339a-11ea-8555-f973c33c2058.png)

    Exception in thread Thread-1:
    Traceback (most recent call last):
      File "D:\Anaconda3\envs\wtx\lib\threading.py", line 916, in _bootstrap_inner
        self.run()
      File "D:\Anaconda3\envs\wtx\lib\threading.py", line 864, in run
        self._target(*self._args, **self._kwargs)
      File "D:\tf-wavenet_vocoder-master\apps\vocoder\datasets\data_feeder.py", line 145, in thread_main...

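> > @FanhuaandLuomu The input is a sequence of pinyin initials and finals. I had been worried that inserting blanks would double the length of the input sequence, make the production implementation slower, and so hurt first-packet latency and RTF. I have now added the blanks and the pronunciation problem is gone. With blanks, first-packet latency is 100 ms and the overall RTF is around 0.03, which is acceptable.
>
> > @hermanseu Yes, I also added blanks earlier and it reduced the dropped-syllable problem. Are you running inference on GPU, or did you switch to streaming?

@FanhuaandLuomu

1. I still have not figured out the root cause of the dropped syllables. I checked the mean and variance of the predicted latent variables: the mean is very small and the variance looks somewhat large. Using only the mean as the latent variable still produces dropped syllables, so I wonder whether the loss is poorly balanced and the mean is being learned with a bias. I got busy with other work and did not continue the experiments. If you have any ideas, could you share them?
2. I have code that dumps the trained model to a binary file; C/C++ code then reads that binary model and runs inference on CPU with streaming output.

_Originally posted by @hermanseu in https://github.com/jaywalnut310/vits/issues/2#issuecomment-1396587418_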
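For readers unfamiliar with the "blank" trick discussed above, the following is a minimal sketch of interleaving a blank token between phoneme IDs. It is an illustration only: the function name `intersperse_blank`, the blank ID `0`, and the example IDs are assumptions, not taken from either poster's code. It shows why the input grows to roughly twice its length (exactly `2n + 1` for `n` tokens).

```python
from typing import List


def intersperse_blank(token_ids: List[int], blank_id: int = 0) -> List[int]:
    """Return token_ids with blank_id placed before, between, and after every token.

    blank_id=0 is an assumed convention for this sketch.
    """
    # Pre-fill with blanks, then drop the real tokens into the odd positions.
    result = [blank_id] * (2 * len(token_ids) + 1)
    result[1::2] = token_ids
    return result


if __name__ == "__main__":
    phoneme_ids = [5, 3, 8]  # hypothetical IDs for three initials/finals
    with_blanks = intersperse_blank(phoneme_ids)
    print(with_blanks, len(phoneme_ids), "->", len(with_blanks))
```

The longer sequence is the cost the first comment refers to: every downstream encoder step processes about twice as many positions.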

![image](https://user-images.githubusercontent.com/45281733/224249027-a54d1abd-47bc-48f7-b75c-c92a4212da95.png) Is the condition `if os.path.exists(wav_path) and '_mic2.flac' in wav_path` wrong? Should `and` be changed to `or`? Or is there a problem with my dataset?
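To make the difference between the two operators concrete, here is a small sketch that evaluates both versions of the condition on three paths. The file names are hypothetical and the helper functions are written for this illustration only; they are not from the repository in question. Under these assumptions, `and` keeps only files that exist and are `_mic2.flac` recordings, while `or` also accepts existing files from another microphone and paths that do not exist at all.

```python
import os
import tempfile


def keep_with_and(wav_path: str) -> bool:
    # Condition as quoted in the question.
    return os.path.exists(wav_path) and '_mic2.flac' in wav_path


def keep_with_or(wav_path: str) -> bool:
    # Proposed change: `and` replaced by `or`.
    return os.path.exists(wav_path) or '_mic2.flac' in wav_path


def demo() -> dict:
    """Map each hypothetical file name to (result with and, result with or)."""
    with tempfile.TemporaryDirectory() as tmp:
        mic1 = os.path.join(tmp, "p225_001_mic1.flac")     # exists, wrong mic
        mic2 = os.path.join(tmp, "p225_001_mic2.flac")     # exists, mic2
        missing = os.path.join(tmp, "p225_002_mic2.flac")  # mic2 name, no file
        for path in (mic1, mic2):
            open(path, "wb").close()  # create empty placeholder files
        return {
            os.path.basename(p): (keep_with_and(p), keep_with_or(p))
            for p in (mic1, mic2, missing)
        }


if __name__ == "__main__":
    for name, (with_and, with_or) in demo().items():
        print(f"{name}: and={with_and} or={with_or}")
```

If no file ever passes the `and` version, one possibility worth checking is whether the dataset actually contains files whose names include `_mic2.flac`.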

![image](https://user-images.githubusercontent.com/45281733/95047298-6761bc00-0718-11eb-81ee-d01a03cccfbb.png)