data
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.
### Describe the bug When multiple workers try to create a folder structure at the same time, creation fails and raises `FileExistsError`. Expected behavior: create folder structure and continue...
Fixed a bug where multiple workers try to write to the same folder that doesn't exist yet, which causes a `FileExistsError` because a different worker has already created the directory structure. Fixes...
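The race described above can be illustrated with a minimal sketch in plain Python. It is not the code from the fix itself: the helper names `ensure_dir` and `create_from_many_workers` and the directory layout are hypothetical, and threads stand in for the workers. It assumes the usual remedy of passing `exist_ok=True` to `os.makedirs`, so a worker that loses the race treats the already-created directory as success.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def ensure_dir(path: str) -> str:
    # Hypothetical helper: exist_ok=True turns "directory already exists"
    # into a no-op, so a worker that loses the race does not raise
    # FileExistsError.
    os.makedirs(path, exist_ok=True)
    return path


def create_from_many_workers(root: str, num_workers: int = 8) -> list:
    # Every worker asks for the same nested directory at roughly the same time.
    # The "cache/shard/data" layout is illustrative only.
    target = os.path.join(root, "cache", "shard", "data")
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(ensure_dir, [target] * num_workers))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        paths = create_from_many_workers(root)
        print(len(paths), os.path.isdir(paths[0]))
```

Note that `exist_ok=True` still raises `FileExistsError` if the target path exists as a regular file rather than a directory.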
### The feature [pypeln](https://cgarciae.github.io/pypeln/#mixed-pipelines) has a nice feature to chain pipelines that may run on different kinds of workers, including process, thread or asyncio. ```python data = ( range(10)...
### The feature Add a progress bar to remote DataPipes that will be shown in the terminal to display the status of the operation. We can potentially use [`tqdm`](https://github.com/tqdm/tqdm)...
### Describe the bug I've noticed large "spikes" in memory usage at the start of epochs when using IterDataPipes with attributes that take a lot of memory. These can...
### The doc issue https://pytorch.org/data/0.6/generated/torchdata.datapipes.iter.MultiplexerLongest.html#torchdata.datapipes.iter.MultiplexerLongest The snippet in the docs ``` >>> from torchdata.datapipes.iter import IterableWrapper >>> dp1, dp2, dp3 = IterableWrapper(range(5)), IterableWrapper(range(10, 15)), IterableWrapper(range(20, 25)) >>> list(dp1.mux_longest(dp2, dp3))...
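For readers unfamiliar with the interleaving that `mux_longest` is documented to perform, here is a plain-Python stand-in. It is a sketch of the technique (round-robin over several iterables until all are exhausted), not torchdata's implementation, and the function name `round_robin_longest` is hypothetical.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def round_robin_longest(*iterables: Iterable[T]) -> Iterator[T]:
    # Take one element from each input in turn; drop an input once it is
    # exhausted and stop only when every input is exhausted.
    iterators = [iter(it) for it in iterables]
    while iterators:
        still_active = []
        for it in iterators:
            try:
                yield next(it)
            except StopIteration:
                continue  # this input is done; do not keep it for the next round
            still_active.append(it)
        iterators = still_active


if __name__ == "__main__":
    print(list(round_robin_longest(range(5), range(10, 15), range(20, 25))))
    print(list(round_robin_longest([1, 2, 3], [10])))
```

With inputs of equal length the result is a strict interleave. With unequal lengths, the shorter inputs simply drop out and the remaining elements of the longer ones follow.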
### Describe the bug I tried all these versions, the only version that worked was the last one, but it's too hacky. Is there a better way? ```py dp...
### Describe the bug When using `ShardingFilterIterDataPipe`, the data in the datapipe will be evenly sharded to `num_of_instances` workers. However, if we call `batch()` later on the datapipe, the...
### Changes - Added the option to recursively traverse a given path
### The feature I would like to be able to easily create ``m`` dispatching processes feeding ``n`` worker processes. Currently you can only have a single dispatching process. In...