DataLoader2 with multiprocessing raises exception: Can not request next item while we are still waiting response for previous request
🐛 Describe the bug
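from torchdata.dataloader2 import DataLoader2, MultiProcessingReadingService

dp = ...  # the source IterDataPipe
dp = dp.sharding_filter()
rs = MultiProcessingReadingService(num_workers=4)
dataloader = DataLoader2(dp, reading_service=rs)
for _ in dataloader:
    pass
dataloader.shutdown()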
Exception: Can not request next item while we are still waiting response for previous request
This exception is thrown by __iter__ of _IterateQueueDataPipes(datapipes=[QueueWrapper, QueueWrapper, QueueWrapper, QueueWrapper])
Versions
[pip3] numpy==1.26.4
[pip3] onnx==1.15.0
[pip3] onnxconverter-common==1.13.0
[pip3] onnxruntime==1.15.1
[pip3] skl2onnx==1.16.0
[pip3] torch==2.2.1
[pip3] torchaudio==2.2.1
[pip3] torchdata==0.7.1
[pip3] torchvision==0.17.1
[pip3] triton==2.2.0
[conda] numpy 1.26.4 pypi_0 pypi
[conda] torch 2.2.1 pypi_0 pypi
[conda] torchaudio 2.2.1 pypi_0 pypi
[conda] torchvision 0.17.1 pypi_0 pypi
[conda] triton
I have noticed this happens whenever the number of workers you specify for the MultiProcessingReadingService is greater than the number of elements that can be yielded from the dp before sharding.
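To make that condition concrete, here is a minimal pure-Python sketch of round-robin sharding, which is how I understand `sharding_filter` to split elements across workers (element `i` goes to worker `i % num_workers`). The helper names `shard` and `shards_per_worker` are hypothetical and not part of torchdata, and the sketch does not reproduce the queue protocol that raises the exception. It only shows that with more workers than elements, at least one worker ends up with an empty shard.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def shard(items: Iterable[T], num_instances: int, instance_id: int) -> Iterator[T]:
    """Yield only the items whose index maps to this worker (round-robin).

    Hypothetical helper: an assumed model of sharding, not torchdata code.
    """
    for i, item in enumerate(items):
        if i % num_instances == instance_id:
            yield item


def shards_per_worker(items: List[T], num_workers: int) -> List[List[T]]:
    """Return the list of elements each worker would see after sharding."""
    return [list(shard(items, num_workers, w)) for w in range(num_workers)]


if __name__ == "__main__":
    # 3 elements, 4 workers: the last worker receives nothing.
    for worker_id, elements in enumerate(shards_per_worker(list(range(3)), 4)):
        print(f"worker {worker_id}: {elements}")
```

With 3 elements and 4 workers, worker 3 has nothing to yield, which matches the situation described above. Whether an empty shard is the actual trigger for the exception is only my observation.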
@jdenhof I checked it: even when the number of workers specified for the MultiProcessingReadingService is smaller than the number of elements that can be yielded from the dp before sharding, I still have this issue.
I also noticed this issue at the end/start of the first epoch. Any fix?
@ds2268 I fixed it in https://github.com/pytorch/data/pull/1311