Possible unhandled error from worker: ray::ParallelIteratorWorker.par_iter_next_batch()
The following errors are only error prints. This is a bug in Ray and will be fixed in the future.
2020-12-01 20:44:59,081 ERROR worker.py:977 -- Possible unhandled error from worker: ray::ParallelIteratorWorker.par_iter_next_batch() (pid=24362, ip=192.168.3.6)
File "python/ray/_raylet.pyx", line 464, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 419, in ray._raylet.execute_task.function_executor
File "/Users/xianyang/miniconda3/envs/torch/lib/python3.7/site-packages/ray/util/iter.py", line 1158, in par_iter_next_batch
batch.append(self.par_iter_next())
File "/Users/xianyang/miniconda3/envs/torch/lib/python3.7/site-packages/ray/util/iter.py", line 1152, in par_iter_next
return next(self.local_it)
StopIteration
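To illustrate why the traceback above can be harmless, here is a minimal plain-Python sketch. It is not Ray's code: `LocalWorker`, `drain`, and the `batch_size` argument are hypothetical stand-ins, and it assumes the consumer treats `StopIteration` as the normal end-of-data signal.

```python
class LocalWorker:
    """Hypothetical stand-in for the worker in the traceback; not Ray's implementation."""

    def __init__(self, items):
        self.local_it = iter(items)

    def par_iter_next(self):
        # next() raises StopIteration once the local iterator is exhausted.
        return next(self.local_it)

    def par_iter_next_batch(self, batch_size):
        # Assumption: batching by item count, for illustration only.
        batch = []
        for _ in range(batch_size):
            batch.append(self.par_iter_next())
        return batch


def drain(worker, batch_size):
    """Collect batches until the worker signals exhaustion via StopIteration."""
    batches = []
    while True:
        try:
            batches.append(worker.par_iter_next_batch(batch_size))
        except StopIteration:
            # Expected end of data: the exception is handled here, not a failure.
            break
    return batches


batches = drain(LocalWorker(range(6)), batch_size=2)
print(batches)
```

In this sketch the exception is raised inside the worker and handled by the caller, so a log line printed at the raise site would look like an error even though the run completes normally.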
Hi, I also got this error. Is there a bug report for Ray?
I am getting these errors too. Torch also complains about the input tensor size (I am running the NYC taxi fare prediction example). Any idea why this is happening?
(pid=56035) 2021-02-23 12:26:00,645 INFO distributed_torch_runner.py:58 -- Setting up process group for: tcp://9.1.44.100:55874 [rank=1]
(pid=56022) 2021-02-23 12:26:00,643 INFO distributed_torch_runner.py:58 -- Setting up process group for: tcp://9.1.44.100:55874 [rank=0]
(pid=56035) /home/guryaniv/anaconda3/envs/raydp/lib/python3.6/site-packages/torch/nn/modules/loss.py:822: UserWarning: Using a target size (torch.Size([256, 1])) that is different to the input size (torch.Size([256])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
(pid=56035) return F.smooth_l1_loss(input, target, reduction=self.reduction, beta=self.beta)
(pid=56022) /home/guryaniv/anaconda3/envs/raydp/lib/python3.6/site-packages/torch/nn/modules/loss.py:822: UserWarning: Using a target size (torch.Size([256, 1])) that is different to the input size (torch.Size([256])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
(pid=56022) return F.smooth_l1_loss(input, target, reduction=self.reduction, beta=self.beta)
Epoch-0: {'num_samples': 1737186, 'epoch': 1.0, 'batch_count': 3393.0, 'train_loss': 5.325447512030797, 'last_train_loss': 5.227337598800659}
(pid=56035) /home/guryaniv/anaconda3/envs/raydp/lib/python3.6/site-packages/torch/nn/modules/loss.py:822: UserWarning: Using a target size (torch.Size([241, 1])) that is different to the input size (torch.Size([241])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
(pid=56035) return F.smooth_l1_loss(input, target, reduction=self.reduction, beta=self.beta)
(pid=56022) /home/guryaniv/anaconda3/envs/raydp/lib/python3.6/site-packages/torch/nn/modules/loss.py:822: UserWarning: Using a target size (torch.Size([241, 1])) that is different to the input size (torch.Size([241])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
(pid=56022) return F.smooth_l1_loss(input, target, reduction=self.reduction, beta=self.beta)
2021-02-23 12:28:33,382 ERROR worker.py:1053 -- Possible unhandled error from worker: ray::ParallelIteratorWorker.par_iter_next_batch() (pid=56049, ip=9.1.44.100)
File "python/ray/_raylet.pyx", line 480, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 432, in ray._raylet.execute_task.function_executor
File "/home/guryaniv/anaconda3/envs/raydp/lib/python3.6/site-packages/ray/util/iter.py", line 1158, in par_iter_next_batch
batch.append(self.par_iter_next())
File "/home/guryaniv/anaconda3/envs/raydp/lib/python3.6/site-packages/ray/util/iter.py", line 1152, in par_iter_next
return next(self.local_it)
StopIteration
2021-02-23 12:28:33,384 ERROR worker.py:1053 -- Possible unhandled error from worker: ray::ParallelIteratorWorker.par_iter_next_batch() (pid=56014, ip=9.1.44.100)
File "python/ray/_raylet.pyx", line 480, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 432, in ray._raylet.execute_task.function_executor
File "/home/guryaniv/anaconda3/envs/raydp/lib/python3.6/site-packages/ray/util/iter.py", line 1158, in par_iter_next_batch
batch.append(self.par_iter_next())
File "/home/guryaniv/anaconda3/envs/raydp/lib/python3.6/site-packages/ray/util/iter.py", line 1152, in par_iter_next
return next(self.local_it)
StopIteration
Hi @yanivg10, this is just the exception being printed. The actual exception has been caught, so you can ignore it. The Ray community is working on a fix.
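On the separate tensor-size UserWarning in the log: it is raised by PyTorch because the model output has shape `(N,)` while the label has shape `(N, 1)`, so the loss broadcasts the two against each other. A minimal sketch of the effect and one possible fix, using made-up values and assuming you can reshape the label (or the output) before the loss call; where that reshape belongs in the taxi fare example is not shown here:

```python
import torch

loss_fn = torch.nn.SmoothL1Loss()

# Illustrative values only; the shapes mirror the warning: output (N,), label (N, 1).
output = torch.tensor([1.0, 2.0, 3.0, 4.0])
label = torch.tensor([[1.0], [2.0], [3.0], [4.0]])

# Mismatched shapes broadcast to (4, 4), so every output is compared with every label.
# This call emits the same UserWarning as in the log.
broadcast_loss = loss_fn(output, label)

# Flattening the label to shape (4,) compares element i only with element i.
aligned_loss = loss_fn(output, label.view(-1))

print(broadcast_loss.item(), aligned_loss.item())
```

Here the output matches the label exactly, so the aligned loss is zero, while the broadcast version reports a nonzero loss from the cross terms. That is the kind of incorrect result the warning refers to.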
Closing as stale.