MetricGAN

loss: nan

Open LexiYIN opened this issue 3 years ago • 3 comments

Hi Jason,

When I try to train with MetricGAN(table2).py, I run into two problems: (1) At every epoch, for both G and D, the loss is nan, and the only message shown is a warning:

/student/home/yll/anaconda3/envs/MetricGAN/lib/python3.6/site-packages/keras/engine/training.py:973: UserWarning: Discrepancy between trainable weights and collected trainable weights, did you set `model.trainable` without calling `model.compile` after ?
  'Discrepancy between trainable weights and collected trainable'
Epoch 1/1
 - 1498s - loss: nan

(2) During the first epoch of training the Discriminator, it takes 593M of GPU memory, but GPU utilization is 0%.

Data info: I use the same dataset as SEGAN, already downsampled to 16 kHz. Versions: keras-gpu=2.1.2, tensorflow-gpu=1.10, librosa=0.5.1, python=3.6

Do you know why this happens (loss nan)? Thank you very much.

LexiYIN • Mar 16 '21 13:03

Hi,

I have never run into this nan problem before. I'm not sure whether it is caused by a different Python version (mine is Python 2.7) or by the PESQ code. You could also try setting TargetMetric='stoi' to test the code.

JasonSWFu • Mar 17 '21 06:03
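
One way to test the PESQ hypothesis above is to check whether any metric score used as a training target is NaN or infinite, since a single non-finite target is enough to turn a mean-squared-error loss into nan. The following is only a minimal sketch: `find_nonfinite` and the sample scores are hypothetical and not part of the MetricGAN code, and it assumes the scores can be collected into a plain Python list before training.

```python
import math

def find_nonfinite(scores):
    """Return (index, value) pairs for scores that are NaN or infinite."""
    return [(i, s) for i, s in enumerate(scores) if not math.isfinite(s)]

# Hypothetical metric scores for a few utterances (made-up values);
# two of them simulate a failed metric evaluation.
example_scores = [2.31, float("nan"), 3.05, float("inf")]

bad = find_nonfinite(example_scores)
for index, value in bad:
    print("non-finite score at index", index, ":", value)
```

If the list comes back empty for the PESQ scores, the metric code is probably not the source of the nan, and the version differences become the more likely suspect.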

Hi, could you please tell me your TensorFlow version?

LexiYIN • Mar 17 '21 10:03

Hi, it is 1.4.0

JasonSWFu • Mar 30 '21 06:03
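
Since the two environments in this thread differ (Python 3.6 with TensorFlow 1.10 versus Python 2.7 with TensorFlow 1.4.0), it can help to print the versions actually installed in the active environment when comparing setups. This is a rough sketch, not part of the repository: `installed_versions` is a hypothetical helper, and it assumes each package exposes a `__version__` attribute, which is common but not guaranteed.

```python
import importlib
import sys

def installed_versions(module_names):
    """Map each module name to its __version__, or None if it cannot be imported."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None  # module is not installed in this environment
        else:
            # Fall back to "unknown" when the module has no __version__ attribute.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions

print("python", sys.version.split()[0])
for name, version in installed_versions(["keras", "tensorflow", "librosa"]).items():
    print(name, version)
```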