dc_tts

[SOLVED] It is not training

Open Jinex2012 opened this issue 6 years ago • 25 comments

This is pretty weird.

The attention plot is also blank.

alignment_009k

I restarted again, same issue 😕

The synthesised audio is blank. Each sentence produces an audio sample of 10 seconds of silence.

Jinex2012 avatar May 24 '18 10:05 Jinex2012

Yup, facing the same issue. I'm trying to train on a different speech dataset though and the attention plot is blank even after 200k steps. Any help would be appreciated.

bprabhakar avatar May 26 '18 07:05 bprabhakar

I think the issue might be with the Tensorflow version. @Jinex2012 can you tell me which TF version you are using? It's failing for me on v1.8.

bprabhakar avatar May 27 '18 23:05 bprabhakar

@bprabhakar I am using v1.5

Jinex2012 avatar May 28 '18 10:05 Jinex2012

Hi, I am also facing the same issue. Did you guys find the cause? Any help would be appreciated.

NikhilReddy101995 avatar Jun 19 '18 06:06 NikhilReddy101995

Still haven't figured it out.

Jinex2012 avatar Jun 19 '18 07:06 Jinex2012

Has anyone found a solution?

jemoal avatar Jun 25 '18 13:06 jemoal

@Jinex2012 and @bprabhakar, are you running the code with tensorflow on CPU or GPU? I have realised that the author places the queues on the CPU (get_batch() function in 'load_data.py'). I do not know if this may be causing the issue. I have modified the code to put the queues into GPU memory, but I get other errors that I am still trying to solve.

jemoal avatar Jun 25 '18 13:06 jemoal

So it was weird in my case. I was training using Tensorflow 1.8 (GPU) when I faced this issue. Basically when I tried printing the attention values that are being plotted here, they were becoming NaNs in a couple of steps. Very strangely, exactly the same piece of code worked perfectly fine when I switched to a different system with Tensorflow 1.7 (GPU). Perhaps some sorta numerical underflow/overflow?
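
A minimal sketch of the kind of check described above, assuming the attention (alignment) values have already been pulled out of the session as a NumPy array. The function name `attention_health` and the toy arrays are hypothetical and not part of dc_tts; the idea is only to count NaN/inf entries so a blank plot can be told apart from a plotting problem.

```python
import numpy as np


def attention_health(alignment):
    """Count non-finite entries in an attention (alignment) matrix."""
    a = np.asarray(alignment, dtype=np.float64)
    return {
        "shape": a.shape,
        "nan_count": int(np.isnan(a).sum()),
        "inf_count": int(np.isinf(a).sum()),
        "all_finite": bool(np.isfinite(a).all()),
    }


if __name__ == "__main__":
    # Toy stand-ins for a fetched alignment: one healthy, one with a NaN row.
    healthy = np.full((4, 6), 1.0 / 6.0)
    broken = healthy.copy()
    broken[2, :] = np.nan
    print(attention_health(healthy))
    print(attention_health(broken))
```

Running a check like this every few steps would show at which step the values first stop being finite, which is roughly what the printout above revealed.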

bprabhakar avatar Jun 25 '18 14:06 bprabhakar

Thanks @bprabhakar .... I will try this....

jemoal avatar Jun 25 '18 16:06 jemoal

Hi again @bprabhakar. Same as for you, the code has worked well using TF1.7 GPU. I think the problem is related to a specific kind of data used in the definition of 'conv1D'.

Thanks.

Bye, @jemoal

jemoal avatar Jun 26 '18 07:06 jemoal

@bprabhakar thanks for noting. Indeed, I have 1.8 and it diverges for me whatever I do.

arogozhnikov avatar Jul 17 '18 16:07 arogozhnikov

It also works with tensorflow 1.9, but not with tensorflow 1.8. I was not able to find anything relevant in the changelog.

arogozhnikov avatar Jul 17 '18 16:07 arogozhnikov

Thanks for the update. Maybe @Kyubyong can add something in the README.md about telling people not to use tensorflow 1.8?
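
Until the README says so, a guard at the top of a training script could flag the version. This is only a sketch based on the reports in this thread (1.8 diverging, 1.7 and 1.9 working for some people); the function names are hypothetical and nothing here is part of dc_tts.

```python
def is_reported_problem_version(version):
    """Return True for 1.8.x, the release several posters here saw diverge."""
    parts = version.strip().split(".")
    return len(parts) >= 2 and parts[0] == "1" and parts[1] == "8"


def warn_if_problem_version():
    """Return a warning string if the installed TensorFlow is 1.8.x, else None."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed here, so there is nothing to check.
        return None
    if is_reported_problem_version(tf.__version__):
        return ("TensorFlow %s detected: attention values were reported "
                "to turn into NaNs on 1.8 in this thread" % tf.__version__)
    return None


if __name__ == "__main__":
    message = warn_if_problem_version()
    if message:
        print(message)
```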

Jinex2012 avatar Jul 17 '18 16:07 Jinex2012

I guess we can close the issue?

Jinex2012 avatar Jul 17 '18 16:07 Jinex2012

@Jinex2012 no, I would have been stuck for days if I hadn't seen this thread

arogozhnikov avatar Jul 17 '18 16:07 arogozhnikov

OK, it seems the problem is in conv2d_transposed, according to this thread: https://github.com/tensorflow/tensorflow/issues/19200

arogozhnikov avatar Jul 17 '18 17:07 arogozhnikov

I updated Tensorflow from version 1.8 to version 1.9 but I am still getting a blank attention graph. Any help will be appreciated!

redoc700 avatar Aug 02 '18 06:08 redoc700

I tried versions 1.3 to 1.9, on both GPU and CPU, and all of them gave a black (NaN-filled) attention graph. Any solution?

shamidreza avatar Oct 01 '18 20:10 shamidreza

Training for me is working using the latest version of tensorflow-gpu (1.12 as of November 2018) installed through conda (installing the latest versions via pip failed with core aborted or illegal instruction errors). I first uninstalled everything:

pip uninstall tensorflow
pip uninstall tensorboard
pip uninstall tensorflow-gpu

And installed afresh through conda as follows:

conda create -n tensorflow
conda install tensorflow-gpu -n tensorflow

wanshun123 avatar Nov 26 '18 09:11 wanshun123

I am trying to test with some 10-50 wav files (Turkish text) and a reduced number of hidden units, but I am failing to train on custom values (on Mac CPU). I tried tensorflow 1.8, 1.9 and 1.12 (CPU) but I still get an empty attention graph. When I check through synthesise with some test texts, the graph values are all NaN (I noticed librosa giving errors on istft). Has anyone been able to solve this problem (on CPU, not GPU)?

gorkemgoknar avatar Dec 10 '18 12:12 gorkemgoknar

@gorkemgoknar have you solved it? I am facing the same issue

DavidC001 avatar Mar 31 '20 15:03 DavidC001

@gorkemgoknar did you fix that error?

queries01 avatar Jun 18 '20 07:06 queries01

I am training on CPU but the attention plot comes out blank. How can I resolve it? TF version is 1.15.0, and the machine is high-end: 128 GB RAM and 48 cores.

queries01 avatar Jun 18 '20 07:06 queries01

@queries01 Nope, I was not able to fix it. But using a CPU for this is a waste of time. I recommend you try Google Colab or Kaggle Kernels first (if you do not have a GPU).

gorkemgoknar avatar Jun 18 '20 08:06 gorkemgoknar

@gorkemgoknar Colab and Kaggle Kernels are very slow. I have a physical server now, so I want to train on it :)

queries01 avatar Jun 18 '20 08:06 queries01