Andrew Ho
@pytorchbot merge
This is expected because we need to eagerly request state_dict from workers, and we have no way of knowing whether other workers are about to send StopIteration, so we need to ask for more than...
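To make the over-requesting concrete, here's a toy sketch in plain Python (not the actual StatefulDataLoader internals): with workers of uneven length, the loader only learns a worker is exhausted by requesting from it, so it necessarily issues more requests than there are items remaining.

```python
def round_robin_loader(worker_iters):
    """Yield items round-robin, dropping workers as they exhaust."""
    active = list(worker_iters)
    while active:
        still_active = []
        for it in active:
            try:
                # The request must be issued eagerly; only the response
                # tells us whether this worker hit StopIteration.
                item = next(it)
            except StopIteration:
                continue  # worker exhausted; stop polling it
            still_active.append(it)
            yield item
        active = still_active

# Workers with uneven lengths: extra requests go to the short worker
# before its StopIteration can be observed.
workers = [iter(range(3)), iter(range(5))]
print(list(round_robin_loader(workers)))  # [0, 0, 1, 1, 2, 2, 3, 4]
```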
This currently isn't broken, right? I.e., fast-forwarding the sampler will work but may be inefficient. I'm OK either way on landing this before or after the release branch cut
Hi @ShoufaChen, you're correct: it should work without modifications but may be slow for large tables. https://github.com/pytorch/data/blob/main/torchdata/stateful_dataloader/sampler.py#L47 is where we've done the conversion for RandomSampler and BatchSampler as examples....
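For anyone writing their own, here's a rough sketch of the pattern (illustrative names, not the exact torchdata classes): track how many indices have been yielded, snapshot the RNG state at epoch start, and fast-forward on resume. That fast-forward skip is exactly where the large-table slowness comes from.

```python
import torch
from torch.utils.data import Sampler

class StatefulRandomSampler(Sampler):
    """Illustrative only; see torchdata/stateful_dataloader/sampler.py for the real one."""

    def __init__(self, data_source, seed=0):
        self.data_source = data_source
        self.generator = torch.Generator().manual_seed(seed)
        self._epoch_gen_state = self.generator.get_state()
        self.yielded = 0

    def __iter__(self):
        # Snapshot RNG state so a resumed run can regenerate this epoch's permutation.
        self._epoch_gen_state = self.generator.get_state()
        perm = torch.randperm(len(self.data_source), generator=self.generator)
        # Fast-forward past already-consumed indices: correct, but O(yielded),
        # which is why resuming can be slow for very large tables.
        for idx in perm[self.yielded:].tolist():
            self.yielded += 1
            yield idx
        self.yielded = 0  # fresh permutation next epoch

    def __len__(self):
        return len(self.data_source)

    def state_dict(self):
        return {"yielded": self.yielded, "gen_state": self._epoch_gen_state}

    def load_state_dict(self, state_dict):
        self.yielded = state_dict["yielded"]
        self._epoch_gen_state = state_dict["gen_state"]
        self.generator.set_state(self._epoch_gen_state)
```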
Hi everyone, we’d like to share an update about how we plan to use the pytorch/data repo going forward. We will be focusing our efforts on a more iterative approach...
@keunwoochoi thanks for trying this out! We should clarify this in the documentation, but right now the idea is that torchdata.nodes is a super-set of StatefulDataLoader, i.e. nodes should be...
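For example, a DataLoader-style pipeline composed from nodes might look something like the sketch below; constructor arguments may differ between releases, so please check the torchdata.nodes docs rather than treating this as canonical:

```python
from torchdata.nodes import Batcher, IterableWrapper, Loader, ParallelMapper

node = IterableWrapper(range(100))                     # source of samples
node = ParallelMapper(node, map_fn=lambda x: x * 2,
                      num_workers=4, method="thread")  # parallel transform
node = Batcher(node, batch_size=8)                     # collate into batches
loader = Loader(node)                                  # iterable root of the pipeline

for batch in loader:
    pass  # train step

state = loader.state_dict()   # checkpointable, like StatefulDataLoader
loader.load_state_dict(state)
```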
Hi @keunwoochoi, I think this makes a lot of sense. I generally try to avoid inheritance whenever possible, but IMO this is a reasonable use and we could land...
Hi @ShoufaChen, thanks for the issue; we should update the documentation to explain this better. To answer your questions: it depends mainly on the size and composition of your state....
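A rough way to gauge it for your setup is to serialize the state and measure it; ToyDataset below is just a stand-in for your own dataset:

```python
import pickle

from torch.utils.data import Dataset
from torchdata.stateful_dataloader import StatefulDataLoader

class ToyDataset(Dataset):
    def __len__(self):
        return 1000

    def __getitem__(self, i):
        return i

dl = StatefulDataLoader(ToyDataset(), batch_size=32, num_workers=2)
it = iter(dl)
next(it)  # advance a step so sampler/worker state is populated

state = dl.state_dict()
print(sorted(state.keys()))                     # see what the state contains
print(f"serialized: {len(pickle.dumps(state))} bytes")
```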
@yzhangcs you can still request a checkpoint/state_dict at any time; the dataloader will load the last snapshot and "fast-forward" the required steps to get to the correct point. In...
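A minimal sketch of that flow (state_dict()/load_state_dict() are the real StatefulDataLoader API; the dataset is a toy stand-in):

```python
from torch.utils.data import Dataset
from torchdata.stateful_dataloader import StatefulDataLoader

class RangeDataset(Dataset):
    def __len__(self):
        return 100

    def __getitem__(self, i):
        return i

dl = StatefulDataLoader(RangeDataset(), batch_size=10)
it = iter(dl)
next(it), next(it)           # consume two batches
snapshot = dl.state_dict()   # can be requested at any step

# Later, or in a fresh process: load the snapshot and the loader
# fast-forwards to the correct point before yielding.
resumed = StatefulDataLoader(RangeDataset(), batch_size=10)
resumed.load_state_dict(snapshot)
print(next(iter(resumed)))   # third batch: tensor([20, ..., 29])
```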
Hi @yoadbs, thank you for this thoughtful RFC! I've only had a quick look, but this looks like it would be covered by some of our plans in torchdata to...