tensorflow-wavenet

Only noise is generated

Open b03201003 opened this issue 8 years ago • 16 comments

Hi, I used the code as-is, and the output sounds like pure noise. I ran "python train.py" and then "python generate.py". The resulting file is attached as a zip.

Thanks! generated.wav.zip

b03201003 avatar Jun 14 '17 15:06 b03201003

Same problem, looking for help. My generated.wav is only a few seconds long and contains nothing but noise. This may be helpful:

#218

0x-stan avatar Jun 15 '17 13:06 0x-stan

I get the following in my terminal, and I am not sure whether I am running it properly or whether this warning is expected: "Warning: onespeaker/Audio/p225/p225_265.wav was ignored as it contains only silence. Consider decreasing trim_silence threshold, or adjust volume of the audio." How would I fix it so I don't get this warning?

aelbialy-tbox avatar Jun 15 '17 17:06 aelbialy-tbox

How many steps? It takes me about 80,000+ steps before things sound okay. Also, use a seed wav file. I seem to get better results that way.

devinroth avatar Jun 17 '17 07:06 devinroth

Change the trim_silence threshold or adjust the audio volume to get rid of the warning.
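To illustrate why either fix removes the warning, here is a minimal sketch of energy-based silence trimming. It is not the repository's actual implementation: the function names, the non-overlapping framing, and the `frame_length` default are assumptions chosen for illustration. A file is dropped when no frame's RMS energy exceeds the threshold, so lowering the threshold or raising the volume both let a quiet recording through.

```python
import math


def frame_rms(samples, frame_length=2048):
    """Return the RMS energy of each consecutive, non-overlapping frame."""
    energies = []
    for start in range(0, len(samples), frame_length):
        frame = samples[start:start + frame_length]
        energies.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return energies


def trim_silence(samples, threshold, frame_length=2048):
    """Keep the span from the first to the last frame louder than threshold.

    Returns an empty list when every frame is at or below the threshold,
    which is the "contains only silence" case (hypothetical helper).
    """
    energies = frame_rms(samples, frame_length)
    loud = [i for i, energy in enumerate(energies) if energy > threshold]
    if not loud:
        return []
    start = loud[0] * frame_length
    end = min((loud[-1] + 1) * frame_length, len(samples))
    return samples[start:end]


# A quiet recording: constant amplitude 0.01, far below a 0.3 threshold.
quiet = [0.01] * 4096

dropped = trim_silence(quiet, threshold=0.3)            # treated as silence
kept_low_threshold = trim_silence(quiet, threshold=0.0)  # lower threshold
louder = [x * 50 for x in quiet]                          # raise the volume
kept_louder = trim_silence(louder, threshold=0.3)
```

In this sketch, `dropped` is empty while both `kept_low_threshold` and `kept_louder` retain all 4096 samples.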

devinroth avatar Jun 17 '17 07:06 devinroth

@devinroth hi, can I get some advice from you? I trained for 99,999 steps but still could not hear a clear sound.

XiaoSX avatar Jun 21 '17 03:06 XiaoSX

@XiaoSX post an audio file.

devinroth avatar Jun 21 '17 04:06 devinroth

@devinroth ok, here is the result file. generated2_wav.zip

XiaoSX avatar Jun 21 '17 05:06 XiaoSX

@XiaoSX use a seed wav file and try again.

devinroth avatar Jun 21 '17 05:06 devinroth

@devinroth what is a seed wav file? I followed the original tutorial from https://github.com/ibab/tensorflow-wavenet, and the only "factor" I changed is SILENCE_THRESHOLD, from 0.3 to 0. The corpus I used is VCTK. :)

ChrisChan2013 avatar Dec 25 '17 10:12 ChrisChan2013

@devinroth hey Devin, what loss do you get with 80k+ iterations? Can you post the data, including trim threshold and model config you're using?

rafaelvalle avatar Dec 25 '17 14:12 rafaelvalle

Take a snippet of an audio file to jump start the generation. Ideally use something that wasn't used while training.
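One way to prepare such a snippet is sketched below with Python's standard `wave` module. The helper name `cut_seed` and the file names in the usage note are hypothetical; the sketch only copies a time range out of an existing PCM wav file and assumes nothing about how generate.py consumes it.

```python
import wave


def cut_seed(src_path, dst_path, start_sec=0.0, duration_sec=1.0):
    """Copy duration_sec of audio, starting at start_sec, into a new wav file.

    Hypothetical helper: works on uncompressed PCM wav files only.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Clamp the start position so it never points past the end of the file.
        src.setpos(min(int(start_sec * rate), src.getnframes()))
        frames = src.readframes(int(duration_sec * rate))

    with wave.open(dst_path, "wb") as dst:
        # Reuse channel count, sample width and rate from the source;
        # the frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)
    return dst_path
```

For example, `cut_seed("held_out_speaker.wav", "seed.wav", start_sec=1.0, duration_sec=0.5)` would write half a second of audio to `seed.wav`, which could then be given to generate.py through its `--wav_seed` option.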


devinroth avatar Dec 25 '17 21:12 devinroth

I believe I got around 3% loss. More than that ended up being overtrained.

I'll post some stuff if I get a chance. Probably not for a few weeks though.


devinroth avatar Dec 25 '17 21:12 devinroth

What do you mean by 3% loss? Loss should be between 0 and +infinity. Did you train on a single VCTK speaker or on all speakers?
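As a rough illustration of that range, the sketch below computes a per-sample cross-entropy loss, assuming the model predicts a categorical distribution over 256 quantized amplitude levels (8-bit mu-law, as in the WaveNet paper). The function and variable names are illustrative and are not taken from the repository.

```python
import math


def cross_entropy(probs, target):
    """Negative log-probability (in nats) assigned to the correct class."""
    return -math.log(probs[target])


num_classes = 256  # assumed number of quantization levels

# A model that has learned nothing spreads probability evenly: loss = ln(256).
uniform = [1.0 / num_classes] * num_classes
baseline = cross_entropy(uniform, target=0)

# A perfect prediction gives a loss of zero.
perfect = cross_entropy([1.0] + [0.0] * (num_classes - 1), target=0)

# A confident but wrong prediction is penalized without any upper bound.
wrong = [1e-12] + [(1.0 - 1e-12) / (num_classes - 1)] * (num_classes - 1)
confidently_wrong = cross_entropy(wrong, target=0)
```

Under these assumptions the loss is a quantity in nats rather than a percentage: `baseline` equals ln(256), `perfect` is zero, and `confidently_wrong` is far above the baseline.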

rafaelvalle avatar Dec 25 '17 21:12 rafaelvalle

Hi @devinroth, I'm also getting white noise when following your readme.

Sorry, but I don't understand what you mean by a seed wav file. Do you mean taking a previously generated wav file and feeding it back into training as a parameter? That would just mean putting a noisy wav file back in.

Could you please give some examples?

Is there any chance that someone could update readme.md to show how to generate a working wav file? I followed the readme steps and this is what came out.

TheDevelolper avatar Sep 14 '18 10:09 TheDevelolper

So I found the parameter in the generate script. I tried the following:

python generate.py --wav_out_path=generated.wav --wav_seed corpus/VCTK-Corpus/wav48/p252/p252_016.wav --samples 16000 logdir/train/2018-09-14T09-27-10/model.ckpt-138

Now I can almost hear a human voice, but it is followed by loads of static.

TheDevelolper avatar Sep 14 '18 10:09 TheDevelolper

The seed is a clip of audio similar to what you trained with, not something generated. For example, a clip of someone talking.


devinroth avatar Sep 14 '18 13:09 devinroth