[feature request] Upstream to core PyTorch `StatefulDataLoader` and `StatefulDistributedSampler`
### 🚀 The feature
Being able to precisely recover the state of data loading (e.g. for checkpointing and resuming training) is a popular feature. It would be great to have it in core PyTorch to increase the visibility of its existence :)
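To make the requested contract concrete, here is a toy, pure-Python sketch of the `state_dict()` / `load_state_dict()` idea behind `StatefulDataLoader` and `StatefulDistributedSampler` (this is an illustration under assumed semantics, not the torchdata implementation): the sampler records how many indices it has yielded, so a resumed run skips exactly that many and continues the same permutation.

```python
import random

class StatefulRandomSampler:
    """Toy sampler illustrating a state_dict/load_state_dict contract.

    NOT the torchdata implementation -- just a sketch: the sampler tracks
    how many indices it has yielded so that a checkpointed state can be
    restored and iteration resumed mid-epoch on the same permutation.
    """

    def __init__(self, data_len, seed=0):
        self.data_len = data_len
        self.seed = seed
        self.yielded = 0  # indices consumed so far in the current epoch

    def __iter__(self):
        rng = random.Random(self.seed)
        perm = list(range(self.data_len))
        rng.shuffle(perm)
        # Skip whatever was already yielded before the checkpoint.
        for i in perm[self.yielded:]:
            self.yielded += 1
            yield i

    def state_dict(self):
        return {"seed": self.seed, "yielded": self.yielded}

    def load_state_dict(self, sd):
        self.seed = sd["seed"]
        self.yielded = sd["yielded"]

# Consume 3 indices, "checkpoint", then resume in a fresh sampler.
s = StatefulRandomSampler(10, seed=42)
it = iter(s)
first = [next(it) for _ in range(3)]
ckpt = s.state_dict()

resumed = StatefulRandomSampler(10, seed=42)
resumed.load_state_dict(ckpt)
rest = list(iter(resumed))

full = list(iter(StatefulRandomSampler(10, seed=42)))
assert first + rest == full  # the resumed run continues the same epoch
```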
### Motivation, pitch
N/A
### Alternatives
No response
### Additional context
No response
Hi @vadimkantorov, we are currently working on upstreaming it:
https://github.com/ramanishsingh/pytorch/tree/upstream_sdl
It would also be great to add a DistributedSamplerWrapper (to wrap an existing RandomSampler, SequentialSampler, or any other custom Sampler - like in https://github.com/catalyst-team/catalyst/blob/master/catalyst/data/sampler.py#L499). Currently, for some reason, DistributedSampler mixes sampling and distributed sharding together...
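A minimal sketch of what such a wrapper could look like (the name `DistributedSamplerWrapper` is borrowed from catalyst; this is an illustration of the separation of concerns, not a proposed PyTorch API): it shards the indices produced by *any* inner sampler across ranks, so the sampling strategy stays independent of the distributed partitioning.

```python
import math

class DistributedSamplerWrapper:
    """Sketch: shard the output of an arbitrary inner sampler by rank.

    Hypothetical illustration, not an actual PyTorch class. Sampling
    (the inner sampler) and sharding (this wrapper) are kept separate,
    unlike torch.utils.data.DistributedSampler which does both.
    """

    def __init__(self, sampler, num_replicas, rank):
        if not 0 <= rank < num_replicas:
            raise ValueError("rank must be in [0, num_replicas)")
        self.sampler = sampler  # any iterable of indices
        self.num_replicas = num_replicas
        self.rank = rank

    def __iter__(self):
        indices = list(self.sampler)
        # Pad by wrapping around so every rank sees the same number of
        # samples (the same trick DistributedSampler uses without drop_last).
        per_rank = math.ceil(len(indices) / self.num_replicas)
        pad = per_rank * self.num_replicas - len(indices)
        padded = indices + indices[:pad]
        return iter(padded[self.rank :: self.num_replicas])

# Usage: wrap a plain sequential "sampler" (a range stands in here)
# and collect each rank's shard.
shards = [
    list(DistributedSamplerWrapper(range(10), num_replicas=3, rank=r))
    for r in range(3)
]
```

Each rank gets an equal-length shard, and the shards jointly cover every index of the inner sampler.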
Another long-standing request is native utilities for handling arrays of strings as tensors (in a DataLoader-friendly way):
- https://github.com/pytorch/pytorch/issues/101699
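One common workaround today is packing strings into a zero-padded byte buffer, i.e. the data you would put into a uint8 tensor of shape `(N, max_len)` plus a lengths array. A stdlib-only sketch of that representation (an illustration of the idea, not a proposed PyTorch API; the function names are made up):

```python
def strings_to_bytes(strings, encoding="utf-8"):
    """Encode a list of strings into a flat, zero-padded byte buffer
    plus per-string lengths -- the layout you'd back a uint8 tensor
    of shape (N, max_len) with. Illustration only, not a real API.
    """
    encoded = [s.encode(encoding) for s in strings]
    max_len = max((len(b) for b in encoded), default=0)
    buf = bytearray(len(encoded) * max_len)  # rows are zero-padded
    for i, b in enumerate(encoded):
        buf[i * max_len : i * max_len + len(b)] = b
    return bytes(buf), [len(b) for b in encoded], max_len

def bytes_to_strings(buf, lengths, max_len, encoding="utf-8"):
    """Inverse: recover the original strings from buffer + lengths."""
    return [
        buf[i * max_len : i * max_len + n].decode(encoding)
        for i, n in enumerate(lengths)
    ]

buf, lengths, width = strings_to_bytes(["cat", "zebra", "ox"])
roundtrip = bytes_to_strings(buf, lengths, width)
```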