Erjia Guan

Results 170 comments of Erjia Guan

Correct me if I am wrong: a metadata DataPipe can be treated as a Sampler that provides indices or file names. Then, the data-reading DataPipe would take them as inputs and...
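To make the Sampler analogy concrete, here is a minimal plain-Python sketch. It does not use the torchdata API; `MetadataPipe`, `ReaderPipe`, and `FAKE_STORAGE` are hypothetical names invented for illustration, and the in-memory dict stands in for real files.

```python
from typing import Callable, Iterable, Iterator, Tuple


class MetadataPipe:
    """Hypothetical sampler-like stage: yields file names only, never data."""

    def __init__(self, file_names: Iterable[str]) -> None:
        self.file_names = list(file_names)

    def __iter__(self) -> Iterator[str]:
        yield from self.file_names


class ReaderPipe:
    """Hypothetical data-reading stage: consumes names, yields (name, payload)."""

    def __init__(self, source: Iterable[str], load_fn: Callable[[str], bytes]) -> None:
        self.source = source
        self.load_fn = load_fn

    def __iter__(self) -> Iterator[Tuple[str, bytes]]:
        for name in self.source:
            yield name, self.load_fn(name)


# In-memory stand-in for a file system so the sketch runs anywhere.
FAKE_STORAGE = {"a.txt": b"alpha", "b.txt": b"beta"}

pipeline = ReaderPipe(MetadataPipe(["a.txt", "b.txt"]), FAKE_STORAGE.__getitem__)
samples = list(pipeline)
```

Because the metadata stage only carries names, it can be shuffled or sharded cheaply before any data is read, which is what a Sampler does for a map-style dataset.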

I will try to write an example for you about how it's going to be based on my understanding of this workflow.

Here is the example: https://gist.github.com/ejguan/e9a2ac94c276babae76f7dbd2a251180 I'd really appreciate any feedback or questions.

@pzelasko > but how does GroupBy "know" it has gathered all the possible elements? I.e. how does it know not to wait indefinitely? That's actually great feedback. I was...
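One common way to keep a streaming group-by from waiting indefinitely is to bound it: emit a group once it reaches a target size, evict the oldest group when a buffer limit is exceeded, and flush whatever remains at the end of the stream. The sketch below is plain Python under those assumptions, not necessarily the approach the truncated reply goes on to describe; `bounded_group_by` and its parameters are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable, Hashable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def bounded_group_by(
    items: Iterable[T],
    key_fn: Callable[[T], Hashable],
    group_size: int,
    buffer_size: int,
) -> Iterator[List[T]]:
    """Group a stream by key without ever holding more than buffer_size + 1 items."""
    groups: "OrderedDict[Hashable, List[T]]" = OrderedDict()
    buffered = 0
    for item in items:
        key = key_fn(item)
        groups.setdefault(key, []).append(item)
        buffered += 1
        if len(groups[key]) == group_size:
            # The group is complete: emit it right away.
            buffered -= group_size
            yield groups.pop(key)
        elif buffered > buffer_size:
            # Buffer is over its limit: emit the oldest (possibly partial) group.
            _, oldest = groups.popitem(last=False)
            buffered -= len(oldest)
            yield oldest
    # End of stream: flush the remaining partial groups in insertion order.
    for group in groups.values():
        yield group


complete = list(bounded_group_by(
    ["a1", "b1", "a2", "c1", "b2", "c2"], lambda s: s[0], group_size=2, buffer_size=3
))
evicted = list(bounded_group_by(
    ["a1", "b1", "c1", "a2"], lambda s: s[0], group_size=3, buffer_size=2
))
```

The trade-off is that a small buffer can split a logical group into several partial ones, as the second call shows, so the caller has to decide whether partial groups are acceptable or should be dropped.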

It's done in both `DistributedReadingService` and `PrototypingMultiprocessingReadingService`. Closing now.

Should we close this issue now that https://github.com/pytorch/data/pull/843 has landed? Or do you want a specific tutorial about splitting a DataPipe?

This is awesome. One nit: serializable should be the same as picklable, IMO.
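For reference, "picklable" can be checked directly with the standard library. This is a minimal sketch; `is_picklable` is a hypothetical helper, not a function from the library under discussion.

```python
import pickle
from typing import Any


def is_picklable(obj: Any) -> bool:
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


builtin_ok = is_picklable(len)            # built-ins are pickled by reference
lambda_ok = is_picklable(lambda x: x)     # lambdas cannot be looked up by name
generator_ok = is_picklable(i for i in range(3))  # generators hold live frames
```

Lambdas and generators are the usual reasons a pipeline stage fails this check, which matters whenever stages are sent to worker processes.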

@NivekT I am concerned about when and how we want to do graph testing. For a single DataPipe instance, graph testing makes no sense. Then, we may want to...

When we have time, we might need to go over our DataPipes again to identify any missing tests, since a few DataPipes were implemented recently. Besides, for future reference,...

So, I guess you want a `DataPipe` that behaves differently based on `WorkerInfo`. I think adding `get_worker_info` is a good feature request. However, adding `set_worker_info` to each `DataPipe` might be too much as...
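As an illustration of a pipe that behaves differently per worker by querying worker information rather than having it set on every stage, here is a plain-Python sketch. `WorkerInfo` and `ShardByWorker` are hypothetical stand-ins written for this example; in PyTorch the closest real analogue of the getter is `torch.utils.data.get_worker_info()`, which returns `None` in the main process.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class WorkerInfo:
    """Hypothetical stand-in for per-worker metadata."""

    worker_id: int
    num_workers: int


class ShardByWorker:
    """Hypothetical pipe that keeps only the items belonging to the current worker."""

    def __init__(
        self,
        source: Iterable[T],
        get_worker_info: Callable[[], Optional[WorkerInfo]],
    ) -> None:
        self.source = source
        self.get_worker_info = get_worker_info

    def __iter__(self) -> Iterator[T]:
        # Query worker info lazily, at iteration time, inside the worker.
        info = self.get_worker_info()
        if info is None:
            # Not running in a worker: pass everything through.
            yield from self.source
            return
        for index, item in enumerate(self.source):
            if index % info.num_workers == info.worker_id:
                yield item


main_process = list(ShardByWorker(range(6), lambda: None))
worker_one = list(ShardByWorker(range(6), lambda: WorkerInfo(worker_id=1, num_workers=3)))
```

Only the stage that needs the information asks for it, so the rest of the graph stays unaware of workers.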