Add support for sharding filter in distributed settings
🚀 The feature
Implement a distributed_sharding_filter that would behave similarly to sharding_filter (https://github.com/pytorch/pytorch/blob/3f140c5b32fa8685cc7a10bdb94f3f8b127e3a92/torch/utils/data/datapipes/iter/grouping.py), but would filter according to the global rank and world size when torch.distributed is initialized. If torch.distributed is not initialized, the filter would simply pass every element through.
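A minimal sketch of what such a filter could look like (the decorator name, class name, and round-robin assignment of elements are illustrative assumptions, not an existing API):

```python
import torch.distributed as dist
from torch.utils.data import IterDataPipe, functional_datapipe


@functional_datapipe("distributed_sharding_filter")  # hypothetical name
class DistributedShardingFilterIterDataPipe(IterDataPipe):
    def __init__(self, source_datapipe: IterDataPipe) -> None:
        self.source_datapipe = source_datapipe

    def __iter__(self):
        if dist.is_available() and dist.is_initialized():
            rank = dist.get_rank()
            world_size = dist.get_world_size()
        else:
            # torch.distributed not initialized: behave as a no-op filter.
            rank, world_size = 0, 1
        for i, item in enumerate(self.source_datapipe):
            # Round-robin: each rank keeps every world_size-th element.
            if i % world_size == rank:
                yield item
```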
Motivation, pitch
I am running distributed training and had to write this filter myself. It is needed in most distributed training scenarios, so it would make a lot of sense to add it to the library.
Alternatives
An alternative implementation would extend the sharding_filter already present in the PyTorch core library. However, this might break backward compatibility.
Additional context
No response
Thanks for raising this. We understand the need and are working on it: we are currently building out DataLoader2 to handle dynamic sharding via sharding_filter for both multiprocessing and distributed scenarios.
Should I close this issue, or should I link it to a feature request in the PyTorch repository?
No need to close it. We will keep you updated once this feature lands.
Just a quick update: if you have sharding_filter in your pipeline, the DataPipe graph will now be dynamically sharded by DataLoader.
You can either use the nightly releases of PyTorch Core and TorchData, or wait a few weeks for the upcoming official release.
Please take a look at the tutorial.
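For example, a pipeline along these lines should be sharded automatically across DataLoader workers (a minimal sketch; the dataset contents and transform are illustrative assumptions):

```python
from torch.utils.data import DataLoader
from torchdata.datapipes.iter import IterableWrapper


def double(x):
    return x * 2


# sharding_filter marks the point where the graph is split across workers.
dp = IterableWrapper(range(100)).shuffle().sharding_filter().map(double)
dl = DataLoader(dp, batch_size=8, num_workers=2)
for batch in dl:
    ...  # each worker iterates over a disjoint shard of the DataPipe graph
```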
We will also keep working on DataLoader2 for better support of parallel execution and other features like snapshotting. Please stay tuned.
This is now done in both DistributedReadingService and PrototypeMultiProcessingReadingService. Closing now.
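For reference, a minimal sketch assuming torchdata's DataLoader2 with DistributedReadingService, launched under torchrun so torch.distributed is initialized with a rank and world size:

```python
from torchdata.dataloader2 import DataLoader2, DistributedReadingService
from torchdata.datapipes.iter import IterableWrapper

dp = IterableWrapper(range(1000)).shuffle().sharding_filter()
dl = DataLoader2(dp, reading_service=DistributedReadingService())
for item in dl:
    ...  # each rank receives a distinct shard of the data
dl.shutdown()
```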