A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.

Results 302 data issues

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #824 * #823

CLA Signed

### 🐛 Describe the bug We call `finalize_iteration` when `__next__` is called on the invalid `DataLoader2Iterator` (let's call it Iter_A): https://github.com/pytorch/data/blob/c42587a828d05f24f6f0586d17d3e9d55e1433ed/torchdata/dataloader2/dataloader2.py#L67-L68 However, as a new `DataLoader2Iterator` (Iter_B) has been...

### 🐛 Describe the bug Per title: https://github.com/pytorch/data/blob/c42587a828d05f24f6f0586d17d3e9d55e1433ed/torchdata/dataloader2/dataloader2.py#L176-L183 Even though the `ReadingService` should guarantee that `finalize_iteration` is called in `finalize`, it's better to guard this behavior in `DataLoader2` as well. ###...
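The guard suggested in this issue can be sketched in plain Python. This is a minimal illustration under stated assumptions, not torchdata's code: `ToyReadingService` and `ToyLoader` are hypothetical stand-ins, and only the method names `initialize_iteration`, `finalize_iteration`, and `finalize` come from the issue text.

```python
class ToyReadingService:
    """Hypothetical stand-in for a reading service; records the calls it receives."""

    def __init__(self):
        self.calls = []

    def initialize_iteration(self):
        self.calls.append("initialize_iteration")

    def finalize_iteration(self):
        self.calls.append("finalize_iteration")

    def finalize(self):
        # Deliberately does NOT call finalize_iteration, to show why a
        # loader-side guard can be useful.
        self.calls.append("finalize")


class ToyLoader:
    """Hypothetical loader that tracks whether an iteration is still open."""

    def __init__(self, reading_service):
        self.reading_service = reading_service
        self._iteration_open = False

    def start_iteration(self):
        self.reading_service.initialize_iteration()
        self._iteration_open = True

    def end_iteration(self):
        # Guard: finalize an open iteration at most once.
        if self._iteration_open:
            self.reading_service.finalize_iteration()
            self._iteration_open = False

    def shutdown(self):
        # Guard: close any open iteration before finalizing the service.
        self.end_iteration()
        self.reading_service.finalize()


rs = ToyReadingService()
loader = ToyLoader(rs)
loader.start_iteration()
loader.shutdown()
print(rs.calls)
```

With the flag in place, `finalize_iteration` runs even if the service's own `finalize` forgets it, and a repeated `shutdown` does not finalize the same iteration twice.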

Cherry-pick of planned changes to ProtoRS

CLA Signed

Summary: Add the initial support for DataLoader2 to control randomness over the pipeline:
- Implement `SeedGenerator`
- Change API of `ReadingService.initialize_iteration` to take seed generator from DataLoader2
- The seed...
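The general idea of handing a seed generator to the reading service at the start of each iteration might look like the sketch below. Everything here is an assumption for illustration: `ToySeedGenerator`, its `generate_seed` method, `ToyReadingService`, and the `num_workers` parameter are hypothetical and are not taken from the PR, whose actual `SeedGenerator` design is not shown in this summary.

```python
import random


class ToySeedGenerator:
    """Hypothetical generator that derives a reproducible stream of seeds."""

    def __init__(self, base_seed: int):
        # A private RNG keeps the seed stream independent of global state.
        self._rng = random.Random(base_seed)

    def generate_seed(self) -> int:
        return self._rng.getrandbits(64)


class ToyReadingService:
    """Hypothetical service that receives the generator per iteration."""

    def __init__(self, num_workers: int):
        self.num_workers = num_workers
        self.worker_seeds = []

    def initialize_iteration(self, seed_generator: ToySeedGenerator) -> None:
        # Draw one seed per worker so each worker gets its own random stream.
        self.worker_seeds = [
            seed_generator.generate_seed() for _ in range(self.num_workers)
        ]


service_a = ToyReadingService(num_workers=2)
service_a.initialize_iteration(ToySeedGenerator(base_seed=0))

service_b = ToyReadingService(num_workers=2)
service_b.initialize_iteration(ToySeedGenerator(base_seed=0))

# The same base seed yields the same per-worker seeds.
print(service_a.worker_seeds == service_b.worker_seeds)
```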

CLA Signed
fb-exported

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #809 * #652 * #562 This reverts commit 6b737118df146f7daa4b15b3dea3dd76f6bca1cf.

CLA Signed

### 🚀 The feature `S3FileLister` and `S3FileLoader` currently don't support keyword arguments beyond `request_timeout_ms`, `region`, `buffer_size`, and `multi_part_download`. One example is [here](https://discuss.pytorch.org/t/bucket-versioning-using-s3fileloader/162699/2), where a user would like to read...

enhancement

Stack from [ghstack](https://github.com/ezyang/ghstack): * __->__ #785 Fixes #625 Fixes #627 Fixes #628

CLA Signed

### 🚀 The feature We already have an S3 integration and it seems like the S3 API already works with both * Azure: https://devblogs.microsoft.com/cse/2016/05/22/access-azure-blob-storage-from-your-apps-using-s3-api/ * GCP: https://vamsiramakrishnan.medium.com/a-study-on-using-google-cloud-storage-with-the-s3-compatibility-api-324d31b8dfeb ### Motivation, pitch...

Originally I expected the stream returned from `S3handler` to be a non-seekable stream. But it turns out that the whole archive/files will be dumped into memory based on the implementation (I...
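The difference between the two kinds of stream can be shown with the standard library alone. This is a sketch, not the `S3handler` implementation: `ForwardOnlyStream` is a hypothetical class written for illustration, while `io.BytesIO` stands in for a payload that has been loaded into memory in full.

```python
import io


class ForwardOnlyStream(io.RawIOBase):
    """Hypothetical non-seekable stream that serves bytes strictly in order."""

    def __init__(self, payload: bytes):
        super().__init__()
        self._payload = payload
        self._pos = 0

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return False

    def readinto(self, buffer) -> int:
        # Copy the next chunk into the caller's buffer and advance.
        chunk = self._payload[self._pos:self._pos + len(buffer)]
        buffer[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


in_memory = io.BytesIO(b"archive bytes")  # whole payload held in memory
forward_only = ForwardOnlyStream(b"archive bytes")

print(in_memory.seekable())
print(forward_only.seekable())
```

Calling `seekable()` on a returned stream is a quick way to tell which behavior you are getting before relying on `seek`.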