webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
Right now, decoding errors do not report much information about what happened, which makes finding and fixing the invalid data quite hard. I think it would be great to improve...
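A minimal user-side sketch (the shard name is an assumption, and it assumes the fluent `.decode()` stage accepts a `handler` argument): install a handler on the decode stage that prints the exception and skips the sample instead of aborting the run, so bad records can at least be surveyed.

```python
import webdataset as wds

# Sketch of a user-side workaround; "data-000000.tar" and the keys are assumptions.
def report_and_skip(exn):
    print(f"decoding failed: {exn!r}")
    return True  # returning True tells webdataset to skip the offending sample

dataset = (
    wds.WebDataset("data-000000.tar")
    .decode("pil", handler=report_and_skip)
    .to_tuple("jpg", "cls")
)
```

The built-in `wds.warn_and_continue` handler behaves similarly, printing the exception and moving on.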
In certain use cases, it would be useful to be able to record when a tar file (out of many) has been fully consumed by `tariterators.tar_file_expander`. For local files, it...
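Until such a hook exists, a rough user-side sketch (shard pattern is an assumption, and it assumes the fluent `.compose()` method) is to watch the `__url__` field attached to each sample and log when it changes:

```python
import webdataset as wds

# Rough sketch: infer "shard fully consumed" from the point where the
# __url__ field of consecutive samples changes; the final shard is logged
# when the stream ends.
def log_shard_boundaries(samples):
    current = None
    for sample in samples:
        url = sample.get("__url__")
        if current is not None and url != current:
            print(f"finished reading {current}")
        current = url
        yield sample
    if current is not None:
        print(f"finished reading {current}")

dataset = wds.WebDataset("shards/train-{000000..000009}.tar").compose(log_shard_boundaries)
```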
Basically I use webdataset inside my variation of https://github.com/mlfoundations/open_clip. The problem is that if curl inside gopen doesn't get any tar files (due to network errors), its pipe doesn't...
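One hedged mitigation sketch (the URL is an assumption, and it assumes `WebDataset` accepts a `handler` argument): make the curl pipe fail loudly with `--fail`, and attach a handler so shard-level errors are surfaced rather than silently producing an empty stream.

```python
import webdataset as wds

# Sketch: --fail turns HTTP errors into a nonzero exit status for the pipe,
# and warn_and_continue reports shard-level failures instead of ignoring them.
urls = "pipe:curl -s -L --fail https://example.com/shards/train-{000000..000099}.tar"
dataset = wds.WebDataset(urls, handler=wds.warn_and_continue)
```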
I create a webdataset below, where ResampledShards is defined to repeat tar files so that every GPU and every worker can load a different tar file. But I found that...
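For reference, a minimal sketch of that kind of pipeline (shard pattern, keys, and sizes are assumptions). Note that `ResampledShards` draws shards with replacement indefinitely, so the epoch length has to be pinned explicitly:

```python
import webdataset as wds

# Minimal sketch; shard pattern, keys, and sizes are assumptions.
urls = "shards/train-{000000..000511}.tar"
dataset = wds.DataPipeline(
    wds.ResampledShards(urls),    # re-sample shards so each GPU/worker sees a different stream
    wds.tarfile_to_samples(),
    wds.shuffle(1000),
    wds.decode("pil"),
    wds.to_tuple("jpg", "json"),
).with_epoch(10000)               # pin the nominal number of samples per epoch
```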
Hi There, I'm getting lots of warnings in my tests when I use `WebDataset` and PyTorch `DataLoader`, like the following: ``` /path/envs/python39/lib/python3.9/site-packages/webdataset/tariterators.py:171: ResourceWarning: unclosed file for source in data: ResourceWarning:...
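A minimal sketch of a setup that surfaces those warnings (shard name and keys are assumptions); breaking out of the loader early, as tests often do, is a typical way to leave tar streams unclosed:

```python
import warnings
import webdataset as wds
from torch.utils.data import DataLoader

warnings.simplefilter("always", ResourceWarning)  # make unclosed-file warnings visible

# Shard name and keys are assumptions.
dataset = wds.WebDataset("data-000000.tar").decode("pil").to_tuple("jpg", "cls")
loader = DataLoader(dataset, batch_size=None, num_workers=2)

for i, sample in enumerate(loader):
    if i == 10:
        break  # exiting early can leave the underlying tar file objects open
```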
I am trying to use webdataset instead of my custom dataset. However, I am seeing a loss of performance during training, and GPU memory usage has also increased. Is this normal?...
Hi, I'm working on Google Colab and trying to set up a minimal example of multi-core PyTorch training with webdataset using data on GCP Buckets. Specifically, I've got a bucket with my...
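A hedged sketch of such a minimal setup (bucket name and shard range are assumptions), streaming shards through a `pipe:` URL with `gsutil cat`:

```python
import webdataset as wds
from torch.utils.data import DataLoader

# Bucket name and shard range are assumptions.
urls = "pipe:gsutil cat gs://my-bucket/shards/train-{000000..000099}.tar"
dataset = (
    wds.WebDataset(urls)
    .shuffle(1000)
    .decode("pil")
    .to_tuple("jpg", "cls")
)
loader = DataLoader(dataset, batch_size=None, num_workers=2)
```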
I noticed that after training for a while I was getting an OSError that there were too many files created. I think I narrowed this down to an issue with...
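If the underlying error is the per-process open-file limit (an assumption, since the report is truncated), a common stopgap is to raise the soft limit at the start of training; it does not fix a genuine handle leak:

```python
import resource

# Stopgap sketch (Unix-only): raise the soft open-file limit toward the hard limit.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
target = hard if hard != resource.RLIM_INFINITY else 4096
resource.setrlimit(resource.RLIMIT_NOFILE, (max(soft, target), hard))
```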
In the `FluidInterface` implementation of `to_tuple()` https://github.com/webdataset/webdataset/blob/b092eb6617e090b3e5261ab60e12b263bc107f51/webdataset/compat.py#L55-L56 `**kwargs` also needs to be handed over to `filters.to_tuple` to use, e.g., `missing_is_error=False`, but currently it is not. It seems this has been fixed in...
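A sketch of the forwarding change being described (the method name and `filters` reference follow the linked `compat.py`; the surrounding class is elided here):

```python
# Sketch of the change described above; only the keyword forwarding differs
# from the linked lines. "filters" is webdataset.filters, as imported in compat.py.
def to_tuple(self, *args, **kw):
    return self.compose(filters.to_tuple(*args, **kw))
```

With the keywords forwarded, a call like `.to_tuple("jpg", "cls", missing_is_error=False)` actually reaches the underlying filter.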
I'm using webdataset in DDP training. Everything works fine when I set num_workers to 0, but if num_workers > 0, the total number of steps in an epoch is wrong. ```python dat...
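A hedged sketch of one way to keep the per-epoch step count consistent with multiple workers (shard pattern, keys, and sizes are assumptions): resample shards and pin the epoch length explicitly on the loader, so it no longer depends on how the shard list divides across workers.

```python
import webdataset as wds

# Shard pattern, keys, and sizes are assumptions.
urls = "shards/train-{000000..000099}.tar"
samples_per_epoch = 100_000
batch_size = 64

dataset = (
    wds.WebDataset(urls, resampled=True)   # infinite, resampled shard stream
    .shuffle(1000)
    .decode("pil")
    .to_tuple("jpg", "cls")
    .batched(batch_size)
)
loader = wds.WebLoader(dataset, batch_size=None, num_workers=4)
# Pin the number of batches per "epoch" explicitly.
loader = loader.with_epoch(samples_per_epoch // batch_size)
```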