data
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.
### 🐛 Describe the bug ## Describe the bug `torchdata` does not work with torch 2.3.0 because `DILL_AVAILABLE` is not available where expected: ``` Python 3.12.2 (main, Feb 6 2024,...
### 🚀 The feature title ### Motivation, pitch rstrip/lstrip can lead to unintended behaviors ### Alternatives _No response_ ### Additional context _No response_
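To make the concern concrete: `str.rstrip` and `str.lstrip` treat their argument as a set of characters rather than a literal suffix or prefix, so they can remove more than intended. The sketch below uses made-up names and four hypothetical helper functions; it is not taken from the torchdata code base. `str.removesuffix` and `str.removeprefix` require Python 3.9 or later.

```python
def strip_suffix_naive(name: str, suffix: str) -> str:
    # rstrip removes ANY trailing characters that appear in `suffix`,
    # in any order and any number of times.
    return name.rstrip(suffix)


def strip_suffix_exact(name: str, suffix: str) -> str:
    # removesuffix removes `suffix` only when `name` ends with that exact string.
    return name.removesuffix(suffix)


def strip_prefix_naive(url: str, prefix: str) -> str:
    # lstrip has the same character-set semantics on the left side.
    return url.lstrip(prefix)


def strip_prefix_exact(url: str, prefix: str) -> str:
    # removeprefix removes `prefix` only when `url` starts with that exact string.
    return url.removeprefix(prefix)


if __name__ == "__main__":
    print(strip_suffix_naive("config.gz", ".gz"))      # confi  (the final "g" of "config" is lost)
    print(strip_suffix_exact("config.gz", ".gz"))      # config
    print(strip_prefix_naive("s3://s3data", "s3://"))  # data   (the leading "s3" of "s3data" is lost)
    print(strip_prefix_exact("s3://s3data", "s3://"))  # s3data
```

The exact variants leave the string unchanged when the suffix or prefix is absent, which is usually the behavior callers expect.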
Within `ProtocolServer` inside `dataloader2/communication/protocol.py`, the exceptions being raised inside are generic `Exception`. Ideally, we should change the exceptions to be more specific, such that they can be handled easily elsewhere....
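One possible shape for such a change is a small exception hierarchy with a shared base class. Everything in the sketch below is hypothetical: the class names `ProtocolError`, `RequestAlreadyPendingError`, `NoPendingRequestError`, and `ToyProtocolServer` are illustrative assumptions and do not reflect the actual contents of `protocol.py`.

```python
class ProtocolError(Exception):
    """Hypothetical base class for protocol-level failures."""


class RequestAlreadyPendingError(ProtocolError):
    """Hypothetical: a new request arrived before the previous one was answered."""


class NoPendingRequestError(ProtocolError):
    """Hypothetical: a response was attempted with no request outstanding."""


class ToyProtocolServer:
    """Toy stand-in used only for illustration; NOT the real ProtocolServer."""

    def __init__(self) -> None:
        self._pending = None

    def receive(self, request: str) -> None:
        if self._pending is not None:
            raise RequestAlreadyPendingError(
                f"cannot accept {request!r}: {self._pending!r} is still pending"
            )
        self._pending = request

    def respond(self) -> str:
        if self._pending is None:
            raise NoPendingRequestError("no request is waiting for a response")
        request, self._pending = self._pending, None
        return f"handled {request}"


def safe_respond(server: ToyProtocolServer):
    # Callers can now catch one specific failure without swallowing unrelated errors.
    try:
        return server.respond()
    except NoPendingRequestError:
        return None
```

Because both specific exceptions derive from `ProtocolError`, callers that want coarse handling can catch the base class, while code that previously caught `Exception` keeps working.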
Fixes #1256 ### Changes * Rename test_state_dict tests so they are easy to shard in CI Actions
### 🐛 Describe the bug The macOS tests of the StatefulDataLoader CI action fail intermittently during shutdown. On macOS it also takes a lot longer than on both Windows and Ubuntu to shut...
Adding deprecation message about S3 IterDataPipes and pointing to https://github.com/awslabs/s3-connector-for-pytorch. Cc @jamesbornholt, @dnauti
Summary: Allows users to get the worker states out of the state dict. It can be used if users want to modify the state offline. Starts off with very preliminary...
### 🐛 Describe the bug Consider the following code: ``` class DatasetStateIterable(torch.utils.data.IterableDataset, Stateful): def __init__(self, length): self.length = length def __iter__(self): return iter(list(range(self.length))) def state_dict(self): print("Calling state dict") return {"key":...
### 🐛 Describe the bug Google Drive redirects users to a virus warning page when a file is too large for it to scan for viruses. This results in GDriveReaderDataPipe...
### 🚀 The feature Currently, Saver only supports write mode and only lets users choose between byte and text mode. It might be useful to allow the flexibility to append to...