Andrew Ho
Hi y'all, @lhoestq I wanted to flag that we currently have a StatefulDataLoader in `pytorch/data/torchdata` that has state_dict/load_state_dict methods, which will call a dataset's state_dict/load_state_dict methods but also handle multiprocessing...
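For anyone landing here, a minimal sketch of what that looks like from the user side (the `CountingDataset` below is a toy iterable dataset invented for illustration; only the `StatefulDataLoader` import and its `state_dict`/`load_state_dict` methods are the actual torchdata API):

```python
from torch.utils.data import IterableDataset
from torchdata.stateful_dataloader import StatefulDataLoader

class CountingDataset(IterableDataset):
    """Toy iterable dataset that tracks its own position."""
    def __init__(self, n):
        self.n = n
        self.i = 0

    def __iter__(self):
        while self.i < self.n:
            yield self.i
            self.i += 1

    # StatefulDataLoader calls these hooks on the dataset for you,
    # including across worker processes when num_workers > 0.
    def state_dict(self):
        return {"i": self.i}

    def load_state_dict(self, state):
        self.i = state["i"]

loader = StatefulDataLoader(CountingDataset(10), batch_size=2, num_workers=0)
it = iter(loader)
next(it)                       # consume one batch
ckpt = loader.state_dict()     # mid-epoch checkpoint

resumed = StatefulDataLoader(CountingDataset(10), batch_size=2, num_workers=0)
resumed.load_state_dict(ckpt)  # resumes from where the first loader stopped
```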
@lhoestq Good find, we are in the midst of updating this disclaimer as we're re-starting development and regular releases, though our approach will be to iterate on DL V1 (ie...
I'm not 100% sure, but I guess this issue comes from building from source. If you `pip download --no-binary :all: --no-dependencies dfply` you'll find the same issue: the `diamonds.csv` file...
Just wanted to chime in that I have also come across this bug, same scenario when using `mask`, except my case was e.g. `mask(X.bool_col1 & (~X.bool_col2))`
Also wanted to add that in the case of `&`, you can use `mask(condA, ~condB)`; alternatively, the `-` sign for inversion also works, e.g. `mask(condA & -condB)`
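To make the workarounds concrete, here's a rough sketch assuming a small DataFrame with two boolean columns (the column names are made up; `X`, `mask`, and the `>>` pipe are the usual dfply API):

```python
import pandas as pd
from dfply import X, mask

df = pd.DataFrame({
    "bool_col1": [True, True, False],
    "bool_col2": [False, True, False],
})

# Buggy form from the report above:
#   df >> mask(X.bool_col1 & (~X.bool_col2))

# Workaround 1: pass the conditions as separate arguments; mask ANDs them.
out = df >> mask(X.bool_col1, ~X.bool_col2)
print(out)

# Workaround 2 (per the comment above): unary minus for inversion, e.g.
#   df >> mask(X.bool_col1 & -X.bool_col2)
# Note that unary minus on boolean columns may depend on the pandas version.
```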
@stas00 stateful dataloader will save and resume samplers for map-style datasets. If no state_dict/load_state_dict is provided by the sampler, it will naively skip samples to fast-forward. See here...
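A rough sketch of that resume flow for a map-style dataset (`SquaresDataset` is a toy dataset made up for illustration; whether the resume is an exact sampler restore or a naive fast-forward depends on whether the sampler exposes state_dict/load_state_dict, per the above):

```python
from torch.utils.data import Dataset
from torchdata.stateful_dataloader import StatefulDataLoader

class SquaresDataset(Dataset):
    """Toy map-style dataset: index in, square out."""
    def __len__(self):
        return 100

    def __getitem__(self, idx):
        return idx * idx

loader = StatefulDataLoader(SquaresDataset(), batch_size=10, shuffle=True)
it = iter(loader)
next(it)
next(it)                       # two batches consumed
ckpt = loader.state_dict()     # records how far the epoch got

resumed = StatefulDataLoader(SquaresDataset(), batch_size=10, shuffle=True)
resumed.load_state_dict(ckpt)  # skips/fast-forwards past the first two batches
next(iter(resumed))            # continues with the third batch of the epoch
```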
@bhack just catching up on this, is this a regression? Is this related to the dataloader?
@bhack sorry I'm a little dense, but what's the issue with too many threads? Can you also create a minimal repro that shows the difference between old and new releases? By...
@bhack a minimal repro would still be appreciated
@bhack check out this PR; it will hopefully help us isolate where the processes/threads are coming from https://github.com/pytorch/pytorch/pull/128448