data
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.
This issue is generated from the TODO line https://github.com/pytorch/data/blob/2f29adba451e1b87f1c0c654557d9dd98673fdd8/./test/test_serialization.py#L206
This issue is generated from the TODO line https://github.com/pytorch/data/blob/2f29adba451e1b87f1c0c654557d9dd98673fdd8/./test/test_dataloader2.py#L164
This issue is generated from the TODO line https://github.com/pytorch/data/blob/2f29adba451e1b87f1c0c654557d9dd98673fdd8/./test/test_dataloader2.py#L160 cc @VitalyFedyunin
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #585

```python
import torchtext

train_iter, val_iter, test_iter = \
    torchtext.datasets.IWSLT2017("data/IWSLT2017", language_pair=('en', 'de'))
print(next(iter(train_iter)))
```
### 🐛 Describe the bug
I am trying to understand what would be the expected result when we traverse a `DataPipe` graph containing circular references. Assume we have the following...
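The issue's own example is truncated here, so what follows is not a reconstruction of it. As a general illustration, one common way to make a graph traversal terminate on circular references is to track visited nodes by identity. A minimal sketch, assuming a hypothetical `Node` class with a `sources` list (these names are not torchdata's API):

```python
from typing import List


class Node:
    """Hypothetical stand-in for a pipeline stage; not a torchdata class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sources: List["Node"] = []


def traverse(root: Node) -> List[str]:
    """Depth-first walk that visits each node once, even if the graph has cycles."""
    seen = set()          # ids of nodes already visited
    order: List[str] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue      # already visited: this is what breaks the cycle
        seen.add(id(node))
        order.append(node.name)
        # Reverse so that sources are visited in their listed order.
        stack.extend(reversed(node.sources))
    return order


# Build a cycle: a -> b -> c -> a
a, b, c = Node("a"), Node("b"), Node("c")
a.sources, b.sources, c.sources = [b], [c], [a]
order = traverse(a)
```

Whether a real `DataPipe` traversal should silently skip the repeated node, as this sketch does, or raise an error is exactly the kind of question the issue is asking.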
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #485
* #484

Differential Revision: [D36866290](https://our.internmc.facebook.com/intern/diff/D36866290)
https://github.com/pytorch/data/blob/cbbad752e25e709a36d6ca16436f44fe0c085d82/torchdata/dataloader2/dataloader2.py#L67 See: https://github.com/pytorch/data/pull/571#issuecomment-1179087697
### 🚀 The feature
In https://github.com/pytorch/pytorch/pull/56497 we tried to fuse sequential `map` calls into one MapDataPipe, but the fused version doesn't respect `input_col` and other kwargs. We can refactor MapperIterDataPipe...
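To illustrate what "respecting `input_col`" could mean when fusing, here is a minimal pure-Python sketch. It does not use torchdata, and `MapStep`, `apply_step`, and `fuse` are hypothetical names introduced only for this example. It assumes each step carries its own column index and that the result is written back to the same column:

```python
from typing import Any, Callable, List, Optional, Tuple

# Hypothetical representation of one `map` step: (fn, input_col).
# input_col=None means fn receives the whole item.
MapStep = Tuple[Callable[[Any], Any], Optional[int]]


def apply_step(item: Any, fn: Callable[[Any], Any], input_col: Optional[int]) -> Any:
    """Apply fn to the whole item, or only to item[input_col], writing the result back."""
    if input_col is None:
        return fn(item)
    out = list(item)
    out[input_col] = fn(out[input_col])
    return tuple(out)


def fuse(steps: List[MapStep]) -> Callable[[Any], Any]:
    """Combine several map steps into one callable, keeping each step's input_col."""

    def fused(item: Any) -> Any:
        for fn, col in steps:
            item = apply_step(item, fn, col)
        return item

    return fused


steps: List[MapStep] = [
    (lambda x: x + 1, 0),      # acts on column 0
    (lambda s: s.upper(), 1),  # acts on column 1
    (lambda x: x * 10, 0),     # acts on column 0 again
]
fused = fuse(steps)
result = [fused(row) for row in [(1, "a"), (2, "b")]]
```

The point of the sketch is that a fused callable has to keep per-step metadata rather than simply composing the functions; composing `fn` alone would lose which column each step targets.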
This is somewhat related to https://github.com/pytorch/data/issues/533 As described in https://github.com/pytorch/data/issues/533#issuecomment-1163381945, we like to check the `len()` of the DataLoader in torchvision in our logging utils. Are there plans to implement...
### 🐛 Describe the bug
I ran a series of speed benchmarks comparing 3 methods of building a `datapipe`. This is a continuation of the conversation from https://github.com/pytorch/data/issues/454#issuecomment-1141858156
Brief context:...