dc_tts

Error occurs when running "python train.py 2"

Open tkgmomosheep opened this issue 5 years ago • 9 comments

I have run "train.py 1" to >400k.

When I run "train.py 2", I run into an OOM (out-of-memory) error. The log from the command is here: https://pastebin.com/3hgn7rnd

I am using an RTX 2080 Ti and have 64 GB of RAM. I am using Python 3.7, CUDA 10.0 and TensorFlow 1.14.0.

tkgmomosheep avatar Oct 11 '19 12:10 tkgmomosheep

Try Python 2.

ssnake avatar Oct 13 '19 17:10 ssnake

Thanks for the reply, I'll try it later on.

By the way, I looked at the loss in TensorBoard and it's not what I expected. Any idea what might be wrong?

[Screenshot: 2019-10-14 01_42_02-TensorBoard]

tkgmomosheep avatar Oct 13 '19 17:10 tkgmomosheep

My variant. The model was trained on a Russian dataset (Common Voice). [image]

ssnake avatar Oct 17 '19 06:10 ssnake

@tkgmomosheep have you overcome your issue? I ran into it too, but it happened after I replaced a huge dataset with a specific dataset that has 180 files in it.

ssnake avatar Oct 31 '19 09:10 ssnake

My dataset only has 13x files in it. I had to lower the batch size from 32 to 16 to get it to work. I don't know of any workaround other than lowering the batch size.

tkgmomosheep avatar Oct 31 '19 09:10 tkgmomosheep

Yes, it helped. I reduced the batch size from 32 to 16 and it started to work.

ssnake avatar Oct 31 '19 09:10 ssnake

How do you lower the batch size?

Traincraft101 avatar Dec 15 '19 17:12 Traincraft101

Look at hyperparams.py; near the end of it, just set the new value B = 16.
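For reference, a minimal sketch of that change. It assumes the batch size is the attribute `B` on a `Hyperparams` class in hyperparams.py; the class name and layout are assumptions here, and all other settings are omitted, so check against your own copy of the file:

```python
# hyperparams.py (excerpt only; a sketch, not the full file)
# Assumption: settings live as class attributes on a class named Hyperparams.
class Hyperparams:
    # Batch size. The thread reports that lowering this from 32 to 16
    # stopped the OOM error when running "python train.py 2".
    B = 16


if __name__ == "__main__":
    # Quick check that the edited value is the one being picked up.
    print("batch size:", Hyperparams.B)
```

A smaller batch means fewer samples are held in GPU memory per training step, which is why this avoids the OOM; the trade-off is more steps per pass over the data.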

ssnake avatar Dec 15 '19 19:12 ssnake

Yeah, lowering the batch size worked for me too.

Pritzier avatar Sep 11 '20 19:09 Pritzier