
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.

Results 302 data issues

### 🚀 The feature MPRS currently looks specifically for ShardingRoundRobinDispatch to determine the non-replicable part of the graph that gets executed in the main process before passing work to worker...

### 📚 The doc issue Merely out of curiosity: Are there any future plans that you are willing to disclose? There still is a link to a Future Plans section...

### 🚀 The feature Given that we will not support Python 3.7 in future releases, we can utilize [`multiprocessing.shared_memory`](https://docs.python.org/3/library/multiprocessing.shared_memory.html#module-multiprocessing.shared_memory), which was introduced in Python 3.8. It can potentially replace...
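For readers unfamiliar with the module named in this request, the following is a minimal sketch of what `multiprocessing.shared_memory` offers: a named block of memory that a second process can attach to by name. The helper `roundtrip` is hypothetical and only illustrates the standard-library API; it does not show how torchdata would use it, and for brevity it attaches from the same process rather than from a worker.

```python
from multiprocessing import shared_memory


def roundtrip(payload: bytes) -> bytes:
    """Write payload into a shared block, then read it back via a second handle."""
    # Create a named block; the OS may round the size up, so always slice
    # by len(payload) rather than relying on the buffer length.
    owner = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        owner.buf[: len(payload)] = payload
        # A worker process would attach the same way, using only the name.
        reader = shared_memory.SharedMemory(name=owner.name)
        try:
            return bytes(reader.buf[: len(payload)])
        finally:
            reader.close()
    finally:
        owner.close()
        owner.unlink()  # release the block once no handle needs it
```

The payload must be non-empty, since a block cannot be created with size zero.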

### 🚀 The feature Highlight in the documentation that the MPRS attaches non-replicable datapipe branches at its end. Also mention the currently undocumented / obscure `is_replicable()`...

Fixes #454 ### Changes - Only load from datapipe until requested element is loaded - Add test for this behavior
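The behavior described in this change, loading only as far as the requested element, can be sketched in plain Python. This is an illustration under assumptions, not the code from the pull request: `LazyIndexed` is a hypothetical name, and it wraps an ordinary iterable rather than a DataPipe.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


class LazyIndexed:
    """Index into an iterable, consuming it only up to the requested position."""

    def __init__(self, source: Iterable[T]) -> None:
        self._it: Iterator[T] = iter(source)
        self._cache: List[T] = []

    def __getitem__(self, index: int) -> T:
        if index < 0:
            raise IndexError("negative indices would require loading everything")
        # Pull items only until the requested index is available.
        while len(self._cache) <= index:
            try:
                self._cache.append(next(self._it))
            except StopIteration:
                raise IndexError(index) from None
        return self._cache[index]

    @property
    def loaded(self) -> int:
        """Number of elements pulled from the source so far."""
        return len(self._cache)
```

Requesting element 2 pulls three items from the source and leaves the rest untouched; later requests for earlier elements are served from the cache.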

CLA Signed

### 🚀 The feature It would be great if FSSpecFileLister could iteratively list files instead of preloading them all. ### Motivation, pitch With a cloud provider, listing all the files...
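The difference between preloading a listing and producing it iteratively can be sketched with the standard library. This is a local-filesystem stand-in, not the fsspec or `FSSpecFileLister` API; `iter_files` is a hypothetical helper that yields each path as soon as it is discovered.

```python
import os
from typing import Iterator


def iter_files(root: str) -> Iterator[str]:
    """Yield file paths under root one at a time instead of building a full list."""
    pending = [root]  # directories still to be scanned
    while pending:
        current = pending.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)
                else:
                    yield entry.path
```

A consumer can start working on the first path right away, or stop early with `next(iter_files(root))`, without waiting for the whole tree to be walked.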

Fixes: https://github.com/pytorch/data/issues/335 As draft, since the tests don't run for me at the moment. Also, CLA pending at the moment.

CLA Signed

### 🚀 The feature Beyond the on-disk cache and in-memory cache, it would be useful and performant to have a memmap cache (see tensordict: https://github.com/pytorch-labs/tensordict/blob/main/tensordict/memmap.py). It would boost performance due...

### 🚀 The feature Provide a mechanism to catch exceptions raised by a previous DataPipe and retry. These can be related but separate DataPipes. Non-DataPipe implementations should also be considered....
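One possible shape for the non-DataPipe variant of this request is a generator that restarts its upstream source after a failure. The sketch below is an assumption-laden illustration, not a proposed torchdata API: `iterate_with_retry`, `make_source`, `max_retries`, and `retry_on` are hypothetical names, and the approach assumes the source can be re-created and yields items in the same order each time.

```python
from typing import Callable, Iterable, Iterator, Tuple, Type, TypeVar

T = TypeVar("T")


def iterate_with_retry(
    make_source: Callable[[], Iterable[T]],
    max_retries: int = 3,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> Iterator[T]:
    """Yield items from make_source(), restarting the source when it raises."""
    delivered = 0  # items already handed to the consumer
    failures = 0
    while True:
        try:
            for position, item in enumerate(make_source()):
                if position < delivered:
                    continue  # skip items delivered before the failure
                delivered += 1
                yield item
            return
        except retry_on:
            failures += 1
            if failures > max_retries:
                raise  # give up and surface the last error
```

Because a Python generator cannot resume after raising, the sketch re-creates the source and skips what was already delivered. A source that is expensive to replay would need a different strategy, such as retrying each element individually.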

feature

Please read through our [contribution guide](https://github.com/pytorch/data/blob/main/CONTRIBUTING.md) prior to creating your pull request. - Note that there is a section on requirements related to adding a new DataPipe. Fixes #{issue number}...

CLA Signed