Erjia Guan
Fixes: https://github.com/pytorch/data/issues/718
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#82975 [DataLoader] BC shuffle for MapDataPipe**
* #82974 [DataPipe] Align shuffling behavior for IterDataPipe and MapDataPipe

Add shuffling logic for `MapDataPipe` when using `DataLoader`
Before this PR, if we lazily read data from each file in a RAR archive, the `fd` would move to the wrong location before reading started. After this PR, we reset `fd`...
### 🚀 The feature When we generate the interface of the functional API, we can scan the `__init__` function for all type annotations and default variables that are imported from somewhere by...
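As a rough illustration of what scanning `__init__` could look like, the sketch below uses the standard-library `inspect` module to collect each parameter's annotation and default value. `ExamplePipe` and `scan_init` are hypothetical names made up for this example; they are not part of torchdata or the proposal itself, and the sketch does not attempt to resolve where an annotation or default was imported from.

```python
import inspect
from typing import Any, Dict, Optional, Tuple


class ExamplePipe:
    """Hypothetical class standing in for a DataPipe; not a real library class."""

    def __init__(self, source: str, batch_size: int = 8, seed: Optional[int] = None) -> None:
        self.source = source
        self.batch_size = batch_size
        self.seed = seed


def scan_init(cls: type) -> Dict[str, Tuple[Any, Any]]:
    """Map each `__init__` parameter (except `self`) to (annotation, default)."""
    params = inspect.signature(cls.__init__).parameters
    result: Dict[str, Tuple[Any, Any]] = {}
    for name, param in params.items():
        if name == "self":
            continue
        # Parameters without a default report `inspect.Parameter.empty`.
        result[name] = (param.annotation, param.default)
    return result


scanned = scan_init(ExamplePipe)
for name, (annotation, default) in scanned.items():
    print(name, annotation, default)
```

A generator for the functional interface could then reuse these annotations and defaults instead of duplicating them by hand.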
### 🚀 The feature This issue is used to track TODOs in my mind for DevInfra: ### CI - [x] Compare released git_version with the commit on the top of...
### 🚀 The feature
Add `SingleProcessReadingService` to adapt the graph for the sake of:
- Shuffle seed setting per epoch
- Set shuffle/sharding
- etc.

Pros: This would prevent making...
### 🐛 Describe the bug While working on a fix for the unhashable DataPipe problem in https://github.com/pytorch/pytorch/pull/80509, I found that this test is broken: https://github.com/pytorch/pytorch/blob/e266bea79395399d60bd3c684545f69ae6900236/test/test_datapipe.py#L2307-L2349 The failure is: `TypeError: cannot pickle...
### 📚 The doc issue The examples for Text/Vision/Audio are out-of-date: https://github.com/pytorch/data/tree/main/examples The colab attached in README needs to be updated as well: - How to install torchdata - Example...
### 🐛 Describe the bug I am trying to understand what would be the expected result when we traverse a `DataPipe` graph containing circular references. Assume we have the following...
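To make the question concrete, here is a minimal, library-independent sketch of one way a graph traversal can terminate in the presence of circular references: track visited objects by `id` and cut any edge that points back to one. `Node` and `traverse` are hypothetical stand-ins written for this example; this is not how torchdata's traversal is implemented, and it is not a statement about what the expected result should be.

```python
from typing import Dict, List, Optional, Set


class Node:
    """Hypothetical stand-in for a DataPipe; holds references to other nodes."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.refs: List["Node"] = []


def traverse(node: Node, seen: Optional[Set[int]] = None) -> Dict[str, dict]:
    """Return a nested dict of reachable nodes, skipping already-visited ones."""
    if seen is None:
        seen = set()
    seen.add(id(node))
    children: Dict[str, dict] = {}
    for ref in node.refs:
        if id(ref) in seen:
            # Edge points back into the visited set: cut it to avoid infinite recursion.
            continue
        children.update(traverse(ref, seen))
    return {node.name: children}


a, b = Node("a"), Node("b")
a.refs.append(b)
b.refs.append(a)  # circular reference back to `a`
graph = traverse(a)
print(graph)
```

With this choice the back edge from `b` to `a` is simply dropped from the result; raising an error on a detected cycle would be an equally defensible design.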
### 🐛 Describe the bug There are a couple of places where `DataLoader2` uses `IterDataPipe` as the type hint ([here](https://github.com/pytorch/data/blob/12cfaf8899b1337981cd4edf9deef127f925f1bd/torchdata/dataloader2/dataloader2.py#L22), [here](https://github.com/pytorch/data/blob/12cfaf8899b1337981cd4edf9deef127f925f1bd/torchdata/dataloader2/reading_service.py#L17), etc.). We should add a type called `DataPipe =...