Ossian
Training takes up too much disk space
Hi Oliver,
When I train on a 3.5-hour corpus, I run out of disk space (12 GB) very quickly:
OSSIAN$ du -h train/chv/speakers/news/naive_01_nn -d 1
26M train/chv/speakers/news/naive_01_nn/time_lab
888K train/chv/speakers/news/naive_01_nn/dnn_training_ACOUST
9.7G train/chv/speakers/news/naive_01_nn/cmp
129M train/chv/speakers/news/naive_01_nn/lab_dur
8.7M train/chv/speakers/news/naive_01_nn/align_lab
8.5M train/chv/speakers/news/naive_01_nn/dur
64M train/chv/speakers/news/naive_01_nn/utt
253M train/chv/speakers/news/naive_01_nn/processors
12M train/chv/speakers/news/naive_01_nn/align_log
629M train/chv/speakers/news/naive_01_nn/lab_dnn
11G train/chv/speakers/news/naive_01_nn
I see that most of the space is taken up by the cmp directory:
OSSIAN$ du -h train/chv/speakers/news/naive_01_nn/cmp -d 1
4.0K train/chv/speakers/news/naive_01_nn/cmp/nn_mgc_lf0_vuv_bap_199
4.0K train/chv/speakers/news/naive_01_nn/cmp/nn_norm_mgc_lf0_vuv_bap_199
4.4G train/chv/speakers/news/naive_01_nn/cmp/binary_label_502
2.9G train/chv/speakers/news/naive_01_nn/cmp/nn_no_silence_lab_502
4.0K train/chv/speakers/news/naive_01_nn/cmp/nn_no_silence_lab_norm_502
9.7G train/chv/speakers/news/naive_01_nn/cmp
So binary_label_502 and nn_no_silence_lab_502 take up the most space under cmp.
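For anyone who wants the same per-directory breakdown from Python rather than du, here is a minimal sketch. The function names dir_sizes and largest are hypothetical helpers, not part of Ossian or Merlin. It sums apparent file sizes, so the numbers will differ slightly from du, which reports allocated disk blocks (an empty directory shows as 0 here instead of 4.0K).

```python
import os


def dir_sizes(root):
    """Return {subdirectory name: total bytes of regular files beneath it}.

    Hypothetical helper, roughly comparable to `du -d 1`, but it counts
    apparent file sizes rather than allocated disk blocks.
    """
    sizes = {}
    for entry in os.scandir(root):
        # Only immediate subdirectories are reported; symlinks are skipped.
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if not os.path.islink(path):
                    total += os.path.getsize(path)
        sizes[entry.name] = total
    return sizes


def largest(root, n=2):
    """Return the n largest subdirectories as (name, bytes), biggest first."""
    sizes = dir_sizes(root)
    return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Calling largest("train/chv/speakers/news/naive_01_nn/cmp") on the training directory above should list the two biggest subdirectories first.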
Are there any workarounds?
I'm running Ossian on AWS with 16 GB of disk space. Since the OS takes up about 4 GB, training crashes after I train the frontend and move on to Merlin.
Specifically, the crash happens after this command:
python ./tools/merlin/src/run_merlin.py /home/ubuntu/Ossian/train//chv/speakers/news/naive_01_nn/processors/acoustic_predictor/config.cfg
Thanks!
-josh