DeepHyperX
Parallelize data loading
Currently, the torch DataLoader uses blocking data loading. Although loading itself is very fast (the NumPy arrays are stored in memory), transfer to the GPU and data augmentation (which runs on the CPU) can slow things down.
Setting `num_workers > 0` would make data loading asynchronous, and `num_workers > 1` could increase throughput somewhat.
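A minimal sketch of what this could look like (dataset shapes, the worker count, and variable names are illustrative, not taken from the DeepHyperX code):

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-ins for the in-memory NumPy arrays mentioned above.
data = np.random.rand(256, 10).astype(np.float32)
labels = np.random.randint(0, 3, size=256)

dataset = TensorDataset(torch.from_numpy(data), torch.from_numpy(labels))

# num_workers > 0 spawns worker processes that load and augment the next
# batches in the background while the main process consumes the current one.
# pin_memory=True allocates page-locked host memory, which speeds up
# host-to-GPU transfer when CUDA is available.
loader = DataLoader(
    dataset,
    batch_size=32,
    shuffle=True,
    num_workers=2,  # hypothetical value; should be benchmarked
    pin_memory=torch.cuda.is_available(),
)

for batch_data, batch_labels in loader:
    pass  # training step would go here
```

The right worker count depends on the machine and the augmentation cost, which is why a benchmark (first TODO item) is needed before picking a default.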
TODO:
- [ ] Benchmark speed gain using asynchronous data loading
- [x] Implement asynchronous data loading for all DataLoader objects
- [x] Add a user-input option to define the number of jobs