Erjia Guan

Results 170 comments of Erjia Guan

Technically speaking, you should use `DataLoader` with the DataPipe to get shuffling with different randomness in every epoch. However, there are a few bugs in `DataLoader` and `map.SequenceWrapper`, I...

Could you please try to use `IterDataPipe` via `dp.iter.IterableWrapper` and provide the datapipe to `DataLoader` as a temporary workaround?

The `shuffle` for `IterDataPipe` has to be a buffered shuffle as there isn't a concept of indices. So, in order to achieve global shuffle, you have to provide the size...
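To make the buffered-shuffle idea concrete, here is a minimal pure-Python sketch. It is not the torchdata implementation; the function name `buffered_shuffle` and its `seed` parameter are hypothetical and used only for illustration. It assumes the usual approach: keep a fixed-size buffer, emit a random buffered element each time a new one arrives, and shuffle whatever remains at the end.

```python
import random
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def buffered_shuffle(source: Iterable[T], buffer_size: int, seed: int = 0) -> Iterator[T]:
    """Yield the items of `source` in a locally shuffled order.

    Hypothetical helper for illustration only, not a torchdata API.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)  # use a different seed per epoch for a different order
    buffer: List[T] = []
    for item in source:
        if len(buffer) < buffer_size:
            # Still filling the buffer: nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit a random buffered item and put the new one in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Source is exhausted: emit the remaining buffered items in random order.
    rng.shuffle(buffer)
    yield from buffer


if __name__ == "__main__":
    # Small buffer: items can only move a limited distance from their original position.
    print(list(buffered_shuffle(range(10), buffer_size=4, seed=1)))
    # Buffer at least as large as the data: every item is buffered before anything
    # is emitted, so the result is a shuffle over the whole dataset.
    print(list(buffered_shuffle(range(10), buffer_size=10, seed=1)))
```

In this sketch an item at position `k` in the output always comes from the first `k + buffer_size` inputs, which is why a small buffer only mixes data locally. Only a buffer that can hold the whole stream mixes it globally.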

@DongyuXu77 Thank you for asking. This is actually blocked by determinism support for DataLoader2, which I am currently working on. I will assign the Issue to myself.

Tests are completed, but I still need to remove the `TODO` comment in the test.

> Looks like pin_memory should be a parameter of ReadingService, otherwise 'pin' gets lost when tensors are moved from child processes to the main training loop.
>
> CC @ejguan...

Thank you for opening the issue. It's kind of easy to fix, but we need to consider all the use cases. If users don't specify `sharding_filter` in the pipeline, the length should...

@pmeier Sorry, I missed this issue before. We are currently exploring a plan to add a `BufferSpec` class (configuration) for buffer. We should provide an instance for the default value of...

> > We are currently exploring a plan to add a `BufferSpec` class (configuration) for buffer
>
> Is there a design document / issue for this? If not, what...