LPCTron

Tacotron samplerate 22050 does not match LPCNet 16000

Open superhg2012 opened this issue 5 years ago • 15 comments

1. Do you use the same training dataset to train both Tacotron and LPCNet?

I see that you copy the generated f32 files (16000 Hz) to the audio directory created by the T2 preprocess step.

But your T2 preprocess step uses a sample rate of 22050 Hz to generate the audio, mel, and linear outputs.

Does this matter?

superhg2012 avatar May 24 '19 09:05 superhg2012

There is no need to train LPCNet; you can use the existing model. But I guess it should work better if we train both with the same dataset. Regarding the sample rate, I guess it should be 16k; you might be right.

alokprasad avatar May 24 '19 09:05 alokprasad

I am working on mandarin synthesis, so I need to train LPCNet from scratch.

superhg2012 avatar May 24 '19 09:05 superhg2012

Yes, in that case you need to use the same dataset. You need a single PCM file containing the audio samples for training LPCNet.

alokprasad avatar May 24 '19 09:05 alokprasad
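
A minimal sketch of the "single PCM file" step mentioned above, assuming the dataset is a set of 16-bit mono WAV files already at 16 kHz. The function name `concat_wavs_to_pcm` and the default rate are illustrative assumptions, not code from this repository.

```python
import wave

EXPECTED_RATE = 16000  # assumption: LPCNet is trained on 16 kHz audio


def concat_wavs_to_pcm(wav_paths, out_path, expected_rate=EXPECTED_RATE):
    """Append the raw samples of each WAV file to one headerless PCM file.

    Returns the total number of audio frames written.
    """
    total_frames = 0
    with open(out_path, "wb") as out:
        for path in wav_paths:
            with wave.open(str(path), "rb") as wav:
                # Reject files that would silently corrupt the training data.
                if wav.getframerate() != expected_rate:
                    raise ValueError(
                        f"{path}: {wav.getframerate()} Hz, expected {expected_rate} Hz"
                    )
                if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
                    raise ValueError(f"{path}: expected 16-bit mono audio")
                frames = wav.getnframes()
                out.write(wav.readframes(frames))
                total_frames += frames
    return total_frames
```

The sample-rate check is there because of the mismatch this issue is about: a 22050 Hz file appended to a 16000 Hz PCM stream would not raise any error on its own, it would just play back at the wrong speed.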

So, does Tacotron2 still predict a mel spectrogram as the conditioning input for LPCNet? In the LPCNet paper, the 20-dimensional features are not a mel spectrogram. How does it work?

superhg2012 avatar May 24 '19 09:05 superhg2012

@lyz04551 The features can easily be generated from speech, so we don't need the features generated by Tacotron2.

alokprasad avatar Aug 02 '19 10:08 alokprasad

I think we should change the hop_size and n_fft if we use 16k audio, since we may need to predict the linear spectrum: [image: linear_spec] The left is the original Tacotron, the right is this project's.

lmingde avatar Aug 23 '19 01:08 lmingde
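
One way to read the suggestion above is to keep the hop and window the same length in milliseconds when moving from 22050 Hz to 16000 Hz. The sketch below does that arithmetic. The helper `rescale_frame_params` and the input values (hop 275, window 1100) are hypothetical and are not taken from this repository's hparams.

```python
def rescale_frame_params(hop_size, win_size, src_rate, dst_rate):
    """Scale hop and window lengths so they span the same duration at dst_rate."""
    scale = dst_rate / src_rate
    new_hop = round(hop_size * scale)
    new_win = round(win_size * scale)
    # Pick n_fft as the smallest power of two that holds the new window.
    n_fft = 1 << (new_win - 1).bit_length()
    return {"hop_size": new_hop, "win_size": new_win, "n_fft": n_fft}


# Hypothetical 22050 Hz settings, used only to illustrate the arithmetic.
params = rescale_frame_params(hop_size=275, win_size=1100,
                              src_rate=22050, dst_rate=16000)
print(params)  # {'hop_size': 200, 'win_size': 798, 'n_fft': 1024}
```

If the vocoder expects a fixed frame length instead (the LPCNet paper describes 10 ms frames, which is 160 samples at 16 kHz), the hop would need to match that value rather than a rescaled one.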

> I think we should change the hop_size and n_fft if we use 16k audio, since we may need to predict the linear spectrum: [image: linear_spec] The left is the original Tacotron, the right is this project's.

How is the audio quality of the LPCTron you use?

lyz04551 avatar Aug 23 '19 01:08 lyz04551

I got bad quality using 10000 samples for training with LPCTron. What about you?

superhg2012 avatar Aug 26 '19 07:08 superhg2012

> I got bad quality using 10000 samples for training with LPCTron. What about you?

The quality of the speech I synthesized is not very good, and the background always has some harsh sounds.

lyz04551 avatar Aug 26 '19 07:08 lyz04551

> I got bad quality using 10000 samples for training with LPCTron. What about you?

> The quality of the speech I synthesized is not very good, and the background always has some harsh sounds.

Can you give me some samples?

ysujiang avatar May 14 '20 07:05 ysujiang

Hey, could anyone share a recipe for increasing the sample rate?

a-froghyar avatar Jul 22 '20 09:07 a-froghyar

@a-froghyar use sox

alokprasad avatar Jul 23 '20 15:07 alokprasad
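
A minimal sketch of the sox suggestion above, wrapped in Python so it can be run over a whole dataset. The helper names are illustrative. It assumes the `sox` binary is installed and that the target is 16-bit mono audio. Note that this only converts the audio files; it does not change the sample rate the models themselves are configured for.

```python
import shutil
import subprocess


def sox_resample_command(src, dst, rate=16000):
    """Build a sox command that writes dst as 16-bit mono audio at `rate` Hz."""
    # Options placed before the output file name apply to the output file.
    return ["sox", str(src), "-r", str(rate), "-c", "1", "-b", "16", str(dst)]


def resample(src, dst, rate=16000):
    """Run sox to resample src into dst; raises if sox is missing or fails."""
    if shutil.which("sox") is None:
        raise RuntimeError("sox was not found on PATH")
    subprocess.run(sox_resample_command(src, dst, rate), check=True)
```

For example, `resample("in_22050.wav", "out_16000.wav")` would produce a 16 kHz copy of a hypothetical 22050 Hz input file.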

Hey @alokprasad I meant regarding the training and synthesis within the repo - I'd like to train Tacotron-2 and LPCNet with 24kHz samples.

a-froghyar avatar Aug 11 '20 12:08 a-froghyar

> I got bad quality using 10000 samples for training with LPCTron. What about you?

Sorry, I want to train Tacotron2 + LPCNet, and I do not understand how to train LPCNet. Should I use the features predicted by Tacotron2 or the raw features?

JunenuJ avatar Sep 06 '20 15:09 JunenuJ

Can you tell me the version for LPCTron?

chengshaodi avatar Jan 15 '22 03:01 chengshaodi