DALI support

Open moskomule opened this issue 5 years ago • 13 comments

Hi, is there any plan to integrate DALI (https://docs.nvidia.com/deeplearning/sdk/dali-developer-guide/docs/index.html) into torchvision for faster preprocessing? I noticed that Chainer is trying to integrate it (https://github.com/chainer/chainer/pull/5067).

moskomule avatar Sep 20 '18 02:09 moskomule

Hi, thanks for opening the issue. I'll have a look at this.

fmassa avatar Sep 21 '18 11:09 fmassa

Thank you. Lately I have found that image preprocessing is the bottleneck. I'll try DALI myself and report how much it speeds up the processing.

moskomule avatar Sep 22 '18 09:09 moskomule

albumentations is also a contender for faster image augmentation.

In my experience, IO is actually a bigger bottleneck than a "slow" pre-processing library. SSDs and NVMe drives(!) help a lot.
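One way to check which of the two dominates in a given setup is to time reading and preprocessing separately. The following is a minimal sketch, not a benchmark harness: `profile_stages` and its `read` and `preprocess` callables are hypothetical names introduced here, and you would plug in your own file reading and transform code.

```python
import time
from typing import Any, Callable, Dict, Iterable


def profile_stages(
    items: Iterable[Any],
    read: Callable[[Any], Any],
    preprocess: Callable[[Any], Any],
) -> Dict[str, float]:
    """Accumulate wall-clock seconds spent reading vs. preprocessing."""
    totals = {"read": 0.0, "preprocess": 0.0}
    for item in items:
        start = time.perf_counter()
        raw = read(item)  # e.g. read the encoded file from disk
        after_read = time.perf_counter()
        preprocess(raw)  # e.g. decode, resize, crop
        after_preprocess = time.perf_counter()
        totals["read"] += after_read - start
        totals["preprocess"] += after_preprocess - after_read
    return totals
```

Keep in mind that the operating system caches recently read files, so a second pass over the same data usually makes IO look much cheaper than it is on a cold start.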

sotte avatar Oct 02 '18 16:10 sotte

Hi @datumbox, it's been a while since this issue had any discussion. I'm curious whether there are any plans to make this happen?

msaroufim avatar Apr 22 '22 16:04 msaroufim

@msaroufim we are currently working on improving the data loading process using PyTorch Data. We do not have immediate plans to integrate DALI directly, but we can review this in the future. As we have very limited resources, I think it's more realistic for such an investigation to happen after the release of the new Datasets API.

ccing @NicolasHug and @pmeier who lead the work on datasets.

datumbox avatar Apr 25 '22 09:04 datumbox

Oh interesting so the way you'd integrate new backends in the future is to integrate them within torch.data? Also where can I learn more about the new Datasets API?

cc @VitalyFedyunin @ejguan @wenleix

msaroufim avatar Apr 25 '22 18:04 msaroufim

> Oh interesting so the way you'd integrate new backends in the future is to integrate them within torch.data?

Not sure what you mean by "backends" here. In general you are right though. torchdata is the way to go for the new datasets.

> Also where can I learn more about the new Datasets API?

There is no public documentation yet. However, we already have quite a large collection of datasets ported to the new structure. You can access them with torchvision.prototype.datasets.load(name), where name is the name of the dataset you want to load. For example:

from torchvision.prototype import datasets

dataset = datasets.load("voc")

The dataset object is a regular IterDataPipe defined by torchdata. To transform it you can use the .map method. It takes a callable that will be executed for each sample in the dataset. This sample will be a dictionary with str keys. For example, a simple data pipeline could look like this:

from torchvision.prototype import transforms

transform = transforms.Compose(
    transforms.DecodeImage(),
    transforms.Resize(256),
    transforms.CenterCrop(256),
)

for sample in dataset.map(transform):
    ...

For everything else, please also have a look at the torchdata documentation.

pmeier avatar Apr 26 '22 06:04 pmeier

Adding to @pmeier's comment, this tutorial might help you.

abhi-glitchhg avatar Apr 26 '22 07:04 abhi-glitchhg

@pmeier to clarify, by backend I mean one of these: https://github.com/pytorch/vision#image-backend, e.g. Pillow, accimage, Pillow-SIMD, etc.

Overall, the new interface for adding datasets looks good, but I'm more curious about adding new backends like DALI. In particular, DALI has accelerated image processing kernels and accelerated image decoding, which I think would be very useful to integrate into vision directly. It feels too domain specific to be in torch.data IMHO, and it is similar enough to other backends like accimage to be in vision. What's the process for adding a new backend? If it's similar to the one for accimage (https://github.com/pytorch/vision/blob/main/torchvision/transforms/functional.py#L13), I can make a PR for this.
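For context, a global backend switch of that kind can be sketched roughly as follows. torchvision does expose `set_image_backend` and `get_image_backend`, but everything below is a hypothetical stand-in written for illustration, not torchvision's implementation: the registry, `register_image_backend`, `decode_image`, and the two toy decoders are all assumptions.

```python
from typing import Callable, Dict

# Hypothetical registry: backend name -> function that decodes raw bytes.
_DECODERS: Dict[str, Callable[[bytes], object]] = {}
_active_backend = ""


def register_image_backend(name: str, decoder: Callable[[bytes], object]) -> None:
    """Make a decoder available under `name`; the first one registered becomes active."""
    global _active_backend
    _DECODERS[name] = decoder
    if not _active_backend:
        _active_backend = name


def set_image_backend(name: str) -> None:
    """Select which registered decoder `decode_image` dispatches to."""
    global _active_backend
    if name not in _DECODERS:
        raise ValueError(f"Unknown backend {name!r}; known: {sorted(_DECODERS)}")
    _active_backend = name


def get_image_backend() -> str:
    return _active_backend


def decode_image(data: bytes) -> object:
    """Decode with whichever backend is currently active."""
    return _DECODERS[_active_backend](data)


# Toy decoders standing in for real ones such as Pillow or a DALI-based decoder.
register_image_backend("reference", lambda data: list(data))
register_image_backend("accelerated", lambda data: tuple(data))
set_image_backend("accelerated")
```

Under this design, adding a backend means registering one more decoder, and callers of `decode_image` do not change.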

The other option is to integrate the DALI data loader as a data pipe in torch.data

Here's a good primer on DALI and its value proposition https://cceyda.github.io/blog/dali/cv/image_processing/2020/11/10/nvidia_dali.html

@VitalyFedyunin @wenleix please chime in on where you think the most natural place for a DALI integration is

msaroufim avatar Apr 26 '22 20:04 msaroufim

> The other option is to integrate the DALI data loader as a data pipe in torch.data

Thanks @msaroufim, I had the same feeling about making it a separate DataPipe, because it requires different behavior compared with datapipe.map, such as making sure this DataPipe only runs in a single process to prevent the CUDA context from being copied around. It definitely needs a deeper look at DALI itself.
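To illustrate the single-process constraint, here is a minimal sketch in plain Python. `SingleProcessPipe` is a hypothetical class, not part of torchdata or DALI; it only shows the idea of remembering the creating process and refusing to iterate anywhere else.

```python
import os
from typing import Any, Iterable, Iterator


class SingleProcessPipe:
    """Hypothetical wrapper that refuses to be iterated outside the process that created it."""

    def __init__(self, source: Iterable[Any]) -> None:
        self._source = source
        self._owner_pid = os.getpid()  # remember the creating process

    def __iter__(self) -> Iterator[Any]:
        # A copy that ends up in a worker process still carries the parent's pid,
        # so the comparison fails there and iteration is rejected up front.
        if os.getpid() != self._owner_pid:
            raise RuntimeError(
                "SingleProcessPipe must be iterated in the process that created it"
            )
        return iter(self._source)
```

In an actual PyTorch setup one would more likely consult `torch.utils.data.get_worker_info()`, which returns `None` in the main process; the pid check above is only a dependency-free stand-in for that idea.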

ejguan avatar Apr 26 '22 20:04 ejguan

Seems like there's a good workaround too https://github.com/NVIDIA/DALI/issues/3081#issuecomment-866239816 - I'll take a more thorough look

msaroufim avatar Apr 26 '22 20:04 msaroufim

@msaroufim

> to clarify, by backend I mean one of these: https://github.com/pytorch/vision#image-backend, e.g. Pillow, accimage, Pillow-SIMD, etc.

The new datasets will return a features.EncodedImage, which is a 1D uint8 tensor that just stores the raw bytes. You can decode it however you want. Right now, transforms.DecodeImage() uses PIL as the backend

https://github.com/pytorch/vision/blob/a8f563dbf8520020054aa01f5ae169999775fd19/torchvision/prototype/transforms/_type_conversion.py#L11-L17

https://github.com/pytorch/vision/blob/a8f563dbf8520020054aa01f5ae169999775fd19/torchvision/prototype/transforms/functional/_type_conversion.py#L13-L17

but you can use arbitrary backends there.
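As a rough sketch of what swapping the decoder could look like at the sample level: the helper below wraps any bytes-to-image callable into a function suitable for `.map`. The names `make_decode_transform` and `placeholder_decoder` and the `"image"` key are assumptions made for this example, and plain `bytes` stands in for the encoded tensor.

```python
from typing import Any, Callable, Dict

Sample = Dict[str, Any]


def make_decode_transform(
    decoder: Callable[[bytes], Any], key: str = "image"
) -> Callable[[Sample], Sample]:
    """Build a per-sample function that decodes `sample[key]` with `decoder`."""

    def transform(sample: Sample) -> Sample:
        decoded = dict(sample)  # shallow copy, the input sample is left untouched
        decoded[key] = decoder(sample[key])
        return decoded

    return transform


def placeholder_decoder(raw: bytes) -> list:
    """Stand-in for a real decoder (PIL, DALI, ...): just exposes the byte values."""
    return list(raw)


decode = make_decode_transform(placeholder_decoder)
```

With a real decoder in place of `placeholder_decoder`, something like `dataset.map(decode)` would then replace the `transforms.DecodeImage()` step.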

pmeier avatar Apr 27 '22 06:04 pmeier

There is a similar issue on the torchdata repo: https://github.com/pytorch/data/issues/761. It might be good to keep an eye on it :)

abhi-glitchhg avatar Sep 06 '22 10:09 abhi-glitchhg