webdataset

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

Results 185 webdataset issues

Right now decoding errors are not reporting much information about what happened. It makes finding and fixing the invalid data quite hard. I think it would be great to improve...
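
One way to get more context out of a decoding failure, sketched here without depending on the library itself: wrap the decode step, attach the sample's `__key__` and `__url__` fields to the exception, and pass it to a handler. The sketch assumes the handler convention of webdataset's built-in `warn_and_continue` (the handler receives the exception and returns a truthy value to keep going). `decode_samples`, `CollectErrors`, and `json_decoder` are hypothetical names used only for illustration, not part of webdataset.

```python
import json


def decode_samples(samples, decoder, handler):
    """Decode each sample; on failure, add context and ask the handler what to do."""
    for sample in samples:
        try:
            yield decoder(sample)
        except Exception as exn:
            # Append the sample key and source shard so the bad record can be located.
            exn.args = exn.args + (sample.get("__key__"), sample.get("__url__"))
            if handler(exn):
                continue  # handler asked to skip this sample
            break  # handler asked to stop iteration


class CollectErrors:
    """Hypothetical handler: remember every exception and keep iterating."""

    def __init__(self):
        self.errors = []

    def __call__(self, exn):
        self.errors.append(exn)
        return True


def json_decoder(sample):
    """Hypothetical decoder: parse the `json` field of a sample."""
    return {"__key__": sample["__key__"], "json": json.loads(sample["json"])}


samples = [
    {"__key__": "a", "__url__": "shard-0.tar", "json": b'{"x": 1}'},
    {"__key__": "b", "__url__": "shard-0.tar", "json": b"{not json"},
    {"__key__": "c", "__url__": "shard-1.tar", "json": b'{"x": 3}'},
]

collector = CollectErrors()
decoded = list(decode_samples(samples, json_decoder, collector))
```

After the run, `collector.errors` holds one exception whose last two `args` entries name the failing sample and the shard it came from, which is the kind of information the report asks for.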

In certain use cases, it would be useful to be able to record when a tar file (out of many) has been fully consumed by `tariterators.tar_file_expander`. For local files, it...

enhancement
faq

Basically I use webdataset inside my variation of https://github.com/mlfoundations/open_clip. The problem is that if curl inside gopen doesn't get any tar files (due to any network errors) its pipe doesn't...

enhancement

I create a webdataset below, where ResampledShards is defined to repeat tar files to make sure every GPU and every worker can load a different tar file. But I found that...

enhancement

Hi there, I'm getting lots of warnings in my tests when I use `WebDataset` and PyTorch `DataLoader`, like the following: ``` /path/envs/python39/lib/python3.9/site-packages/webdataset/tariterators.py:171: ResourceWarning: unclosed file for source in data: ResourceWarning:...

I am trying to use webdataset instead of my custom dataset. However, I am seeing a loss of performance during training, and GPU memory usage has also increased. Is this normal?...

Hi, I'm working on Google Colab and trying to set up a minimal example of multi-core PyTorch training with webdataset using data in GCP buckets. Specifically, I've got a bucket with my...

documentation

I noticed that after training for a while I was getting an OSError saying that too many files had been created. I think I narrowed this down to an issue with...

bug

In the `FluidInterface` implementation of `to_tuple()` (https://github.com/webdataset/webdataset/blob/b092eb6617e090b3e5261ab60e12b263bc107f51/webdataset/compat.py#L55-L56), `**kwargs` also needs to be passed on to `filters.to_tuple` so that options such as `missing_is_error=False` can be used, but currently it is not. It seems this has been fixed in...

bug
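
The kind of bug described in this report can be illustrated with a small stand-in, independent of the library. The `to_tuple` function below is a simplified placeholder and not webdataset's `filters.to_tuple`; `BuggyInterface` and `FixedInterface` are hypothetical classes that only show the difference between dropping and forwarding `**kwargs`.

```python
def to_tuple(*keys, missing_is_error=True):
    """Stand-in filter: return a function that extracts `keys` from a sample dict."""

    def extract(sample):
        if missing_is_error:
            missing = [k for k in keys if k not in sample]
            if missing:
                raise KeyError(f"missing fields: {missing}")
        # Missing keys become None when missing_is_error is False.
        return tuple(sample.get(k) for k in keys)

    return extract


class BuggyInterface:
    def to_tuple(self, *args, **kwargs):
        # kwargs are accepted but silently dropped, so options never take effect.
        return to_tuple(*args)


class FixedInterface:
    def to_tuple(self, *args, **kwargs):
        # Forward kwargs so options such as missing_is_error reach the filter.
        return to_tuple(*args, **kwargs)


sample = {"jpg": b"...", "cls": 3}

fixed = FixedInterface().to_tuple("jpg", "json", missing_is_error=False)
result = fixed(sample)

buggy = BuggyInterface().to_tuple("jpg", "json", missing_is_error=False)
try:
    buggy(sample)
    buggy_raised = False
except KeyError:
    buggy_raised = True
```

With the forwarding wrapper the missing `json` field comes back as `None`; with the dropping wrapper the same call raises `KeyError` even though the caller asked for `missing_is_error=False`.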

I'm using webdataset in DDP training. Everything works fine when I set num_workers to 0, but if num_workers > 0, the total number of steps in an epoch is wrong. ```python dat...

faq