Shruti Mittal

Results 19 comments of Shruti Mittal

Hey, sorry, I was travelling for the last 2 days. Attaching a cProfile run on Google Colab. This is after setting `cache_on_load = True` and `preload_wav=True` in the dataset.py file, with `num_workers=1`. ![Screenshot from 2020-03-13 14-26-46](https://user-images.githubusercontent.com/5301961/76605676-d8504500-6536-11ea-837e-3817bbab1231.png) l...
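For anyone who wants to reproduce this kind of profile as text rather than a screenshot, here is a minimal sketch using only the standard library `cProfile` and `pstats` modules. `profile_call` and `fake_epoch` are hypothetical names used for illustration; `fake_epoch` is a stand-in for whatever function runs one training epoch and is not part of the repository.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return (result, text report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show only the `top` entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def fake_epoch(n_batches=100):
    # Hypothetical stand-in for one training epoch.
    total = 0
    for _ in range(n_batches):
        total += sum(range(1000))
    return total


if __name__ == "__main__":
    _, report = profile_call(fake_epoch, 100)
    print(report)
```

Note that cProfile only sees the process it runs in, so with `num_workers` greater than 0 the time spent inside data-loading worker processes would not show up in this report.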

Hi @pswietojanski, CPU usage looks ~100%. Any comments here? This is the htop output on GCP for the 1st epoch, using `num_workers=16`; `P100`; `cache_on_load=True`; `preload_wav=True` ![Screenshot from 2020-03-13 16-45-27](https://user-images.githubusercontent.com/5301961/76616466-17d45c80-654a-11ea-9451-d93e167482c6.png) this is...

I set `os.environ["OMP_NUM_THREADS"] = "1"` and `num_workers = 4`, but this doesn't reduce the time much (by 20-30 min at most).
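One thing worth checking with this setting: OpenMP-backed libraries typically read `OMP_NUM_THREADS` when they initialise, so the variable generally needs to be set before importing them. A minimal sketch follows, assuming the variable is set at the very top of the entry script. `limit_threads_per_process` is a hypothetical helper, and `MKL_NUM_THREADS` is an extra variable added here as an assumption for MKL-backed builds; it is not mentioned above.

```python
import os


def limit_threads_per_process(n_threads=1):
    """Cap the thread pools that OpenMP/MKL-backed libraries create in this process."""
    value = str(n_threads)
    # MKL_NUM_THREADS is an assumption for MKL-backed builds, not from the thread.
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = value
    return value


# Call this before importing numpy, torch, or similar libraries,
# so they see the limit when their thread pools are first created.
limit_threads_per_process(1)
```

Child processes started afterwards inherit the environment, so data-loading workers would see the same limit.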

Hey, I am getting better speed now; I was using a lower number of CPU cores and a `K80` machine earlier. With `num_workers = 8` and a `V100`, the time to train 1 epoch is...

With `num_workers = 16`, a `P100`, and `os.environ["OMP_NUM_THREADS"] = "1"`, an epoch trains in ~90 mins with `preload_wav = True` and `cache_on_load = True`. However, preloading is not improving speed for epoch 2 onwards...

Why is caching not increasing the speed? I am setting `preload_wav = True` and `cache_on_load=True` in train.py. Could CPU-to-GPU data transfer time be a bottleneck?
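One way to narrow this down is to measure how long the loop waits for each batch separately from how long it spends processing it. Below is a minimal, framework-agnostic sketch using only the standard library. `time_loader_vs_step` and `step_fn` are hypothetical names: `batches` would be the data loader, and `step_fn` would wrap the host-to-device copy plus the forward/backward pass. Since CUDA work is asynchronous, `step_fn` would need to call `torch.cuda.synchronize()` before returning for the split to be meaningful on a GPU.

```python
import time


def time_loader_vs_step(batches, step_fn, max_batches=50):
    """Split wall time into waiting for data and processing each batch."""
    data_seconds = 0.0
    step_seconds = 0.0
    n = 0
    iterator = iter(batches)
    while n < max_batches:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # time spent here is data-loading wait
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)  # time spent here is transfer + compute
        t2 = time.perf_counter()
        data_seconds += t1 - t0
        step_seconds += t2 - t1
        n += 1
    return {"batches": n, "data_seconds": data_seconds, "step_seconds": step_seconds}
```

If `data_seconds` dominates even from epoch 2 onwards, the loader is the bottleneck despite caching; if `step_seconds` dominates, faster data loading would not be expected to change the epoch time much.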

Hey, sorry, this was a long time back and I don't remember the details now. I pretty much followed the README and the training scripts to understand the data pre-processing pipeline.

Hey, did you segment your data? I think I got a similar error when I didn't.

No, check the script at `/data/prep/prepare_segmented_dataset_libri.py`