
The audio synthesized from the feature file (.f32) is bad?

Open lmingde opened this issue 5 years ago • 9 comments

I extracted the feature file (.f32) by modifying /Tacotron-2/preprocessor.py:

# I changed preprocessor.py at line 133:
    feature_name = 'feature-{}.f32'.format(wavfile)
    mel_filename = 'mel-{}.npy'.format(wavfile)
    linear_filename = 'linear-{}.npy'.format(wavfile)

    # Transpose, then save the mel spectrogram as .npy
    mel_spectrogram = mel_spectrogram.T
    np.save(os.path.join(mel_dir, mel_filename), mel_spectrogram, allow_pickle=False)
    # Flatten and dump the same values as a raw, headerless file (.f32)
    mel_spectrogram = mel_spectrogram.reshape((-1,))
    mel_spectrogram.tofile(os.path.join(feature_dir, feature_name))
    np.save(os.path.join(linear_dir, linear_filename), linear_spectrogram.T, allow_pickle=False)

Then I ran the following commands:

make test_lpcnet taco=1 # Define TACOTRON2 macro
./test_lpcnet test_features.f32 test.s16
ffmpeg -f s16le -ar 16k -ac 1 -i test.s16 test-out.wav
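The ffmpeg step only wraps the raw 16-bit, 16 kHz, mono PCM in a WAV header. As a minimal sketch, the same conversion can be done with the Python standard library; the function name `s16_to_wav` and its defaults are illustrative and not part of LPCTron.

```python
import wave


def s16_to_wav(pcm_path, wav_path, sample_rate=16000, channels=1):
    """Wrap raw signed 16-bit little-endian PCM in a WAV container.

    Returns the number of audio frames written.
    """
    with open(pcm_path, "rb") as f:
        pcm = f.read()
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return len(pcm) // (2 * channels)
```

For example, `s16_to_wav("test.s16", "test-out.wav")` would correspond to the ffmpeg command above, assuming the file was written on a little-endian machine.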

But the audio is bad: lpctron
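One way to narrow down a problem like this is to inspect the .f32 files directly. A .f32 file is a flat run of float32 values with no header, so the reader has to know the frame width in advance. The sketch below loads such a file and reports per-dimension value ranges, so a file written by the modified preprocessor can be compared with one written by dump_data. The helper names are hypothetical, and `n_features` must be set to whatever frame width your test_lpcnet build expects; it is not taken from this thread.

```python
import os
from array import array


def load_f32_frames(path, n_features):
    """Read a headerless float32 file as a list of frames (lists of floats)."""
    values = array("f")
    if values.itemsize != 4:
        raise RuntimeError("C float is not 4 bytes on this platform")
    n_bytes = os.path.getsize(path)
    if n_bytes % (4 * n_features) != 0:
        # The file cannot be a whole number of frames of this width.
        raise ValueError(
            "%d bytes is not a multiple of %d floats per frame" % (n_bytes, n_features)
        )
    with open(path, "rb") as f:
        values.fromfile(f, n_bytes // 4)  # native byte order
    return [values[i:i + n_features].tolist()
            for i in range(0, len(values), n_features)]


def per_dim_range(frames):
    """Return (min, max) for each feature dimension across all frames."""
    return [(min(col), max(col)) for col in zip(*frames)]
```

If the two files fail the size check for the same `n_features`, or their per-dimension ranges look very different, the features are probably not in the layout the vocoder was trained on. With NumPy, `np.fromfile(path, dtype=np.float32).reshape(-1, n_features)` loads the same data.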

lmingde avatar Aug 20 '19 09:08 lmingde

I ran into the same issue. Have you solved it? The features that Tacotron2 predicts for LPCNet are not accurate.

superhg2012 avatar Aug 21 '19 11:08 superhg2012

I ran into the same issue. Have you solved it? The features that Tacotron2 predicts for LPCNet are not accurate.

No. For now I only use the features that dump_data extracts from audio. I will try to train T2 to predict the features (.f32). Are you using the .f32 features to synthesize?

lmingde avatar Aug 21 '19 13:08 lmingde

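Also, I think this LPCNet needs updating. When I use it to synthesize from features extracted by dump_data (in taco mode), the result is very poor, but with the latest version of LPCNet the result is very good.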

lmingde avatar Aug 21 '19 13:08 lmingde

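@lmingde Hello, may I ask a few questions? I trained LPCNet and Tacotron2 on my own dataset, and the synthesized speech quality is very poor: the speech is unclear and partly distorted. When training LPCNet, the loss at epoch=1 and at epoch=120 differs very little. Did you see anything similar when training LPCNet, and roughly what was your final model's loss? Also, which version is the latest LPCNet you mentioned? Could you share a URL?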

ysujiang avatar May 22 '20 06:05 ysujiang

@ysujiang The latest version is at the link given in the LPCNet paper.

lmingde avatar May 25 '20 01:05 lmingde

@lmingde Could you send me a link, or tell me the title of the paper? Thanks.

ysujiang avatar Jun 02 '20 11:06 ysujiang

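@lmingde Hello, does your trained model produce a trembling (warbling) sound? The speech samples generated by my trained model currently have one. Have you run into this problem? Did you make any changes to taco or lpcnet?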

ysujiang avatar Jul 06 '20 06:07 ysujiang

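@lmingde Hello, may I ask a few questions? I trained LPCNet and Tacotron2 on my own dataset, and the synthesized speech quality is very poor: the speech is unclear and partly distorted. When training LPCNet, the loss at epoch=1 and at epoch=120 differs very little. Did you see anything similar when training LPCNet, and roughly what was your final model's loss? Also, which version is the latest LPCNet you mentioned? Could you share a URL?

Hello, may I ask whether you trained LPCNet and Tacotron2 separately? I am still not clear on this and would appreciate some clear guidance. Thanks.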

JunenuJ avatar Sep 06 '20 15:09 JunenuJ

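@lmingde Hello, does your trained model produce a trembling (warbling) sound? The speech samples generated by my trained model currently have one. Have you run into this problem? Did you make any changes to taco or lpcnet?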

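Hello, I see that the code synthesizes speech with GL (Griffin-Lim). Did you synthesize with GL? My results are very poor. I would also like to ask how you synthesized with LPCNet: did you use the original extracted features, or the features predicted by Tacotron2?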

JunenuJ avatar Sep 08 '20 07:09 JunenuJ