
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.

Results 302 data issues

### 🐛 Describe the bug When multiple workers try to create a folder structure at the same time, it fails and raises `FileExistsError`. Expected behavior: create folder structure and continue...

Fixed a bug where multiple workers writing to the same folder that doesn't exist yet raised a `FileExistsError`, because a different worker had already created the directory structure. Fixes...

CLA Signed
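The race described in the two entries above can be avoided in plain Python by making directory creation idempotent. The snippet below is a minimal sketch, not the change made in the pull request: `ensure_dir`, the thread pool, and the temporary path are hypothetical names chosen for illustration, and threads stand in for the data-loading workers.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def ensure_dir(path: str) -> str:
    # exist_ok=True turns the call into a no-op when another worker
    # has already created the directory, instead of raising FileExistsError.
    os.makedirs(path, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "a", "b", "c")
    # Many workers (threads here, as a stand-in) request the same nested folder at once.
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(ensure_dir, [target] * 32))
    created = os.path.isdir(target)
```

Every call returns normally whether or not it was the one that actually created the folder, which matches the "create folder structure and continue" behavior the report asks for.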

### 🚀 The feature [pypeln](https://cgarciae.github.io/pypeln/#mixed-pipelines) has a nice feature to chain pipelines which may run on different kinds of workers, including process, thread, or asyncio. ```python data = ( range(10)...

### 🚀 The feature Add a progress bar to remote DataPipes that will be shown in the terminal to display the status of the operation. We can potentially use [`tqdm`](https://github.com/tqdm/tqdm)...

### 🐛 Describe the bug I’ve noticed large “spikes” in memory usage at the start of epochs when using IterDataPipes with attributes that take a lot of memory. These can...

### 📚 The doc issue https://pytorch.org/data/0.6/generated/torchdata.datapipes.iter.MultiplexerLongest.html#torchdata.datapipes.iter.MultiplexerLongest The snippet in the docs ``` >>> from torchdata.datapipes.iter import IterableWrapper >>> dp1, dp2, dp3 = IterableWrapper(range(5)), IterableWrapper(range(10, 15)), IterableWrapper(range(20, 25)) >>> list(dp1.mux_longest(dp2, dp3))...
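For readers unfamiliar with the operation, the following pure-Python sketch shows the round-robin behavior that `mux_longest` is understood to have: take one element from each input in turn, skip inputs that are exhausted, and stop once all are exhausted. It does not use torchdata, `mux_longest_sketch` is a hypothetical name, and it makes no claim about what the truncated report above goes on to say.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def mux_longest_sketch(*sources: Iterable[T]) -> Iterator[T]:
    # Assumed semantics: round-robin over the inputs, dropping each input
    # once it is exhausted, until no inputs remain.
    iterators: List[Iterator[T]] = [iter(s) for s in sources]
    while iterators:
        still_active: List[Iterator[T]] = []
        for it in iterators:
            try:
                item = next(it)
            except StopIteration:
                continue  # this input is done; leave it out of the next round
            still_active.append(it)
            yield item
        iterators = still_active


# Same inputs as the docs snippet, but with plain ranges instead of IterableWrapper.
equal = list(mux_longest_sketch(range(5), range(10, 15), range(20, 25)))
# Inputs of different lengths: the longer one keeps yielding after the shorter ends.
uneven = list(mux_longest_sketch(range(3), range(10, 11)))
```

With equal-length inputs the elements interleave evenly; with uneven inputs the remaining elements of the longer input follow once the shorter one runs out.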

### 🐛 Describe the bug I tried all these versions, the only version that worked was the last one, but it's too hacky. Is there a better way? ```py dp...

### 🐛 Describe the bug When using `ShardingFilterIterDataPipe`, the data in the datapipe will be evenly sharded to `num_of_instances` workers. However, if we call `batch()` later on the datapipe, the...
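As background for the report above, here is a small pure-Python sketch of element-level sharding followed by batching. It assumes the sharding filter assigns element `i` to instance `i % num_of_instances`; the helpers `shard` and `batch` are hypothetical stand-ins, not torchdata code, and the sketch does not reproduce whatever the truncated report concludes.

```python
from typing import Dict, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def shard(items: Iterable[T], num_of_instances: int, instance_id: int) -> Iterator[T]:
    # Assumed rule: element i belongs to the instance with id i % num_of_instances.
    for index, item in enumerate(items):
        if index % num_of_instances == instance_id:
            yield item


def batch(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    # Group consecutive items into lists of batch_size; keep a short final batch.
    current: List[T] = []
    for item in items:
        current.append(item)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current


# Ten elements, three workers, batches of two, batching applied after sharding.
per_worker: Dict[int, List[List[int]]] = {
    worker: list(batch(shard(range(10), 3, worker), 2)) for worker in range(3)
}
```

Under these assumptions each worker batches only its own shard, so batch contents and final batch sizes differ from worker to worker.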

### Changes - added the option to recursively traverse a given path

CLA Signed

### 🚀 The feature I would like to easily be able to create ``m`` dispatching processes feeding ``n`` worker processes. Currently you can only have a single dispatching process. In...