
Utility for prefetching data on GPUs

Open · pzelasko opened this issue on Mar 23, 2021 · 5 comments

My understanding is that with PyTorch's DataLoader, we do the following asynchronously:

  • I/O reads
  • transforms
  • collation

and the following synchronously:

  • transfer of batches from the worker process to the training process (upon calling __next__ in the training loop)
  • batch transfer from CPU to GPU (upon calling tensor.to(device))
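In other words, in a standard training loop both synchronous steps run inline and block (a minimal sketch; `dataloader`, `model`, and the `"inputs"` key are placeholders, not actual lhotse names):

```python
import torch

device = torch.device("cuda:0")
for batch in dataloader:                 # __next__ blocks on worker -> main transfer
    inputs = batch["inputs"].to(device)  # blocking CPU -> GPU copy
    loss = model(inputs)                 # compute starts only after the copy is done
```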

I think we can add a class that wraps the DataLoader (FastDataLoader?) and performs both transfers in the background to further speed up training. There seems to be a good example here: https://github.com/NVIDIA/apex/blob/master/examples/imagenet/main_amp.py#L265

Another good one (but doesn't address the CPU -> GPU transfer) is in torchaudio: https://github.com/pytorch/audio/blob/master/torchaudio/datasets/utils.py#L276
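A minimal sketch of what such a wrapper could look like, following the side-stream pattern from the apex example above (the `CUDAPrefetcher` name and the dict-of-tensors batch assumption are mine, not an existing lhotse or PyTorch API):

```python
import torch


class CUDAPrefetcher:
    """Wraps a DataLoader and copies each batch to the GPU on a side CUDA
    stream, so the copy of batch N+1 overlaps with the compute on batch N.
    Assumes each batch is a dict whose tensor values should be moved to
    the device (non-tensor values are passed through unchanged)."""

    def __init__(self, loader, device):
        self.loader = loader
        self.device = device
        self.stream = torch.cuda.Stream(device)

    def _copy_to_device(self, batch):
        # Issue the host->device copies on the side stream and return at once;
        # non_blocking=True is only truly asynchronous when the DataLoader
        # was created with pin_memory=True.
        with torch.cuda.stream(self.stream):
            return {
                k: v.to(self.device, non_blocking=True)
                if isinstance(v, torch.Tensor) else v
                for k, v in batch.items()
            }

    def _sync(self, batch):
        # Make the compute stream wait for the pending copy, and tell the
        # caching allocator these tensors are now used on that stream.
        torch.cuda.current_stream(self.device).wait_stream(self.stream)
        for v in batch.values():
            if isinstance(v, torch.Tensor):
                v.record_stream(torch.cuda.current_stream(self.device))
        return batch

    def __iter__(self):
        it = iter(self.loader)
        try:
            next_batch = self._copy_to_device(next(it))
        except StopIteration:
            return
        for cpu_batch in it:
            batch = self._sync(next_batch)
            # Start copying the following batch while the caller trains on
            # the current one.
            next_batch = self._copy_to_device(cpu_batch)
            yield batch
        yield self._sync(next_batch)
```

Usage would simply be `for batch in CUDAPrefetcher(dloader, torch.device("cuda:0")): ...`, with the underlying DataLoader created with `pin_memory=True` so the copies can actually overlap with compute.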

pzelasko avatar Mar 23 '21 15:03 pzelasko

... unless I'm missing something - to me it seems like this sort of functionality should be a standard pytorch component?

pzelasko avatar Mar 23 '21 15:03 pzelasko

I suppose so... I don't know whether that transfer is really going to be a limiting factor, but if it doesn't make the code significantly harder to understand I suppose it's a nice thing to have.


danpovey avatar Mar 23 '21 16:03 danpovey

If I remember correctly, PyTorch's DataLoader does not support returning CUDA tensors from worker processes when num_workers is greater than zero.

csukuangfj avatar Mar 23 '21 16:03 csukuangfj

@pzelasko Is there a way to store waveforms in HDF5 format, so that we can load data fast and do on-the-fly augmentation?

fanlu avatar Jun 01 '21 23:06 fanlu

Yeah, it is definitely doable. I don’t know how much faster it would be, but I guess it won’t require opening/closing new files all the time.

As a “quick” workaround you can try increasing the number of dataloader workers.
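For reference, a minimal h5py sketch of that idea (the file name, the one-dataset-per-cut layout, the `utterances` iterable, and the `Hdf5WaveformReader` class are illustrative assumptions, not an existing lhotse API):

```python
import h5py
import numpy as np

# Writing: one float32 dataset per utterance/cut ID.
# `utterances` is assumed to yield (cut_id: str, samples: np.ndarray) pairs.
with h5py.File("waveforms.h5", "w") as f:
    for cut_id, samples in utterances:
        f.create_dataset(cut_id, data=samples.astype(np.float32))


class Hdf5WaveformReader:
    """Reads waveforms from a single HDF5 file, keeping the handle open
    to avoid paying the open/close cost on every item."""

    def __init__(self, path):
        self.path = path
        self._file = None

    def read(self, cut_id):
        if self._file is None:
            # Lazy open, so each DataLoader worker gets its own handle after
            # the fork (h5py handles must not be shared across processes).
            self._file = h5py.File(self.path, "r")
        return self._file[cut_id][:]
```

The on-the-fly augmentation would then operate on the raw samples returned by `read()` inside the dataset's `__getitem__`.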

pzelasko avatar Jun 01 '21 23:06 pzelasko