
Inefficient RAM usage?

Open chief7 opened this issue 7 years ago • 4 comments

Hey there! First of all: thanks for the amazing work!

I've hit the problem that my train.py process gets killed by the Linux kernel's OOM killer. Has anyone experienced the same? I suspect some kind of inefficient RAM usage (probably the normalization computations, though I'm only starting to look into it).

I used a different data set but the rest of the code is exactly the same.

Thanks in advance.

EDIT: Forgot to mention that this happens even though I'm running on a GPU!

chief7 avatar Jun 10 '17 05:06 chief7

Update on this one: I did a little research, and it seems the audio features in data_load.py are recalculated over and over again.

As a first hack, I wrote a script that calculates all the WAV files' features and saves them to files using pickle. This brings CPU load down from around 600%-700% on my 8-core machine to around 150%.

Anyway, this doesn't fix the RAM consumption.
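The caching hack above can be sketched roughly like this (a minimal, hypothetical helper — `compute_fn` stands in for whatever feature-extraction code data_load.py actually runs; the cache directory name is made up):

```python
import os
import pickle

def cached_features(wav_path, compute_fn, cache_dir="feature_cache"):
    """Compute features for a wav file once, then reuse the pickled result.

    compute_fn(wav_path) is the expensive feature extraction (e.g. the
    spectrogram code in data_load.py); it only runs on a cache miss.
    """
    os.makedirs(cache_dir, exist_ok=True)
    cache_path = os.path.join(cache_dir, os.path.basename(wav_path) + ".pkl")
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    feats = compute_fn(wav_path)
    with open(cache_path, "wb") as f:
        pickle.dump(feats, f)
    return feats
```

Since the pickled features still get loaded fully into memory, this cuts CPU load but not RAM use, which matches what's reported above.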

chief7 avatar Jun 11 '17 09:06 chief7

@chief7 I used a different data set too, and I'm also seeing low GPU usage. Did you solve this problem?

zuoxiang95 avatar Jul 13 '17 08:07 zuoxiang95

@zuoxiang95 Since my dataset is quite small (~12 GB of features), I managed to load it all into RAM, which improved both training speed and GPU usage.

EDIT: I wrote a small script that does some preprocessing on the files and saves all features to numpy files. That seems to do the trick as well.
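A sketch of that numpy-file approach (hypothetical helper names; loading with `mmap_mode="r"` is one way to avoid pulling every feature array fully into RAM at once):

```python
import numpy as np

def save_features(feats, out_path):
    # Run once during preprocessing: write the feature array to disk.
    np.save(out_path, feats)

def load_features(out_path):
    # mmap_mode="r" memory-maps the .npy file instead of copying it into
    # RAM, so the OS pages data in lazily as the training loop reads it.
    return np.load(out_path, mmap_mode="r")
```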

chief7 avatar Jul 14 '17 15:07 chief7

Related to GPU memory usage: if you have a GPU with plenty of RAM, it is useful to add the following to train.py so you can run eval.py without having to stop the training run from time to time (although it will be slow).

I have 11 GB on a 1080 Ti, so 60% works well for me, and my batch rate held steady at around ~4000/h.

I will try to commit back some code if I produce anything useful, but I am benchmarking some TensorFlow bazel options right now.

115a116,117
>         config.gpu_options.allow_growth = True
>         config.gpu_options.per_process_gpu_memory_fraction = 0.6
118c120
<         with sv.managed_session() as sess:
---
>         with sv.managed_session(config=config) as sess:
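Applied in context, the diff amounts to something like this (TF1-style session setup; the `tf.ConfigProto()` construction and the Supervisor arguments are assumptions, since the diff only shows the two added option lines and the changed `managed_session` call):

```python
import tensorflow as tf

config = tf.ConfigProto()
# Start small and grow allocations as needed, instead of grabbing
# all GPU memory up front.
config.gpu_options.allow_growth = True
# Cap this process at 60% of GPU memory, leaving headroom for eval.py
# to run alongside the training process.
config.gpu_options.per_process_gpu_memory_fraction = 0.6

sv = tf.train.Supervisor(logdir="logdir")  # hypothetical logdir
with sv.managed_session(config=config) as sess:
    ...  # training loop
```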

gdahlm avatar Jul 18 '17 16:07 gdahlm