Tom

Results 170 comments of Tom

The use case for this seems to me to be very limited, since as soon as you use multiple worker processes, you cannot guarantee restartability. How exactly are you training?...

WebDataset does not perform node splitting by default, just splitting by workers. It is supposed to output an error when it discovers that it is running multinode, but doesn't seem...
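The worker/node splitting described above is deterministic modulo arithmetic over the shard list. As a sketch (the function name here is illustrative, not WebDataset's API; the library's own splitters live in functions like `split_by_node` and `split_by_worker`):

```python
def split_by_rank(shards, rank, world_size):
    # Sketch of shard splitting: each rank (node or worker) keeps the
    # shards whose index is congruent to its rank, so ranks never overlap.
    return [s for i, s in enumerate(shards) if i % world_size == rank]

shards = [f"shard-{i:04d}.tar" for i in range(6)]
node0 = split_by_rank(shards, 0, 2)  # shards 0, 2, 4
node1 = split_by_rank(shards, 1, 2)  # shards 1, 3, 5
```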

Yes, that would be great! I love Huggingface and use it a lot, also in teaching. The recommended way is indeed a command line program for fetching the data. This...

It is possible to add Python functions to read data. However, the preferred way is via a subprocess, since that gives us asynchronous I/O for free. The UNIX pipeline interface...
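The subprocess approach can be illustrated with plain `subprocess`; the command below is just a stand-in for whatever fetcher (e.g. `curl`, `gsutil`) you would actually run:

```python
import subprocess

def open_pipe(cmd):
    # Launch the command and expose its stdout as a file-like object;
    # the OS pipe buffer gives us asynchronous read-ahead for free.
    proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    return proc.stdout

# stand-in command; in practice this would be e.g. "curl -s -L <url>"
stream = open_pipe("echo hello")
data = stream.read()
```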

You can just add a handler to `webdataset.writer.default_handlers`, or you can completely override the `encoder` in TarWriter. ImageIO seems to support it natively, so I may add it by default....
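As a sketch of the handler mechanism (names simplified; the real table lives in `webdataset.writer`), an encoder just maps a key's extension to a byte-producing function:

```python
import json

# simplified stand-in for webdataset.writer.default_handlers:
# extension -> function turning a Python object into bytes
handlers = {
    "json": lambda obj: json.dumps(obj).encode("utf-8"),
    "txt": lambda s: s.encode("utf-8"),
}

def encode_sample(sample):
    # Encode each field according to its key's extension, roughly
    # as TarWriter does before writing the sample to a tar shard.
    out = {}
    for key, value in sample.items():
        ext = key.split(".")[-1]
        out[key] = handlers[ext](value) if ext in handlers else value
    return out

encoded = encode_sample({"__key__": "000", "cls.txt": "cat", "meta.json": {"id": 0}})
```

Registering a new format then amounts to adding one entry to the handler table for its extension.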

The long and the short of it is that you probably want to remove the with_epoch from the WebDataset and put it onto WebLoader:

```
data = wds.WebDataset(self.url, resampled=True).shuffle(1000).map(preprocess_train)
loader =...
```
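What `with_epoch` does on the loader is essentially cap an otherwise infinite resampled stream at a nominal epoch length. A minimal sketch of that semantics (the name is borrowed from WebDataset, but this is plain-Python illustration, not the library's implementation):

```python
import itertools

def with_epoch(stream, nsamples):
    # Take exactly nsamples items and call that one "epoch"
    # of an infinite (resampled) stream.
    return list(itertools.islice(stream, nsamples))

infinite = itertools.count()        # stand-in for a resampled dataset
epoch1 = with_epoch(infinite, 5)    # first nominal epoch
epoch2 = with_epoch(infinite, 5)    # next epoch continues the stream
```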

Depends on the application. If you want to do bulk distributed inference, there are two simple approaches: (1) write a simple shard-to-shard transformation and run that in parallel (2) perform...
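Approach (1), a shard-to-shard transformation run in parallel, can be sketched like this (the body of `process_shard` is a hypothetical placeholder; in practice it would read one input tar, run inference on each sample, and write one output tar):

```python
from concurrent.futures import ThreadPoolExecutor

def process_shard(shard):
    # Hypothetical placeholder: read `shard`, run inference on each
    # sample, and write results to a corresponding output shard.
    return shard.replace("input-", "output-")

shards = [f"input-{i:06d}.tar" for i in range(4)]
with ThreadPoolExecutor(max_workers=2) as pool:
    outputs = list(pool.map(process_shard, shards))
```

Because each shard maps to exactly one output shard, the job parallelizes trivially across processes or machines.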

Thanks for spotting this; it's fixed.

Sorry for the late response... The simplest way of handling distributed training is via resampling. With resampling, there is no need to resume training, since simply restarting the job will...
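Resampling means each node draws shards with replacement, forever, so there is no stream position to resume after a restart. A sketch of the idea (plain Python, not WebDataset's implementation):

```python
import itertools
import random

def resampled(shards, seed=0):
    # Infinite stream of shards drawn with replacement; there is no
    # notion of position to resume, so restarts need no bookkeeping.
    rng = random.Random(seed)
    while True:
        yield rng.choice(shards)

shards = [f"shard-{i:04d}.tar" for i in range(10)]
sample = list(itertools.islice(resampled(shards), 5))
```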

The repeat= argument is probably just a leftover from when the `.repeat()` method was added. I'll remove it. Thanks for pointing this out.