PaddleSeg

Why does this warning appear during training? It is a very long run of red warnings

Open loxoo6 opened this issue 1 year ago • 45 comments

Search before asking

Describe the Bug

Uploading image.png…

Environment

paddlepaddle:2.3.2 paddleseg:2.7

Bug description confirmation

  • [X] I confirm that the steps to reproduce the bug, a description of the code changes, and the environment information have been provided, and that the problem can be reproduced.

Are you willing to submit a PR?

  • [X] I'd like to help by submitting a PR!

loxoo6 avatar Jul 04 '23 12:07 loxoo6

Hi, the image link is broken.

Asthestarsfalll avatar Jul 05 '23 12:07 Asthestarsfalll

I ran into this too. It looks like this:

I0706 13:09:12.768538 3772 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0706 13:09:13.075042 3772 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0706 13:09:13.382442 3772 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0706 13:09:13.688205 3772 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0706 13:09:13.996582 3772 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0706 13:09:14.305588 3772 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0706 13:09:14.611194 3772 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0706 13:09:14.919446 3772 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0706 13:09:15.227041 3772 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0706 13:09:15.533654 3772 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
2023-07-06 13:09:15 [INFO] [TRAIN] epoch: 1, iter: 60/40000, loss: 1.8980, lr: 0.009987, batch_cost: 0.3071, reader_cost: 0.00020, ips: 13.0255 samples/sec | ETA 03:24:25

But my code doesn't use Tensor.numpy()[0] anywhere.

gitlonglong avatar Jul 06 '23 05:07 gitlonglong

Downgrading to a slightly older version is enough: paddlepaddle/paddle:2.4.2-gpu-cuda11.7-cudnn8.4-trt8.4

loxoo6 avatar Jul 07 '23 00:07 loxoo6

For reference: https://github.com/PaddlePaddle/PaddleOCR/issues/10302

shiyutang avatar Jul 07 '23 01:07 shiyutang

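The replies above have fully answered the question. If you have a new problem, feel free to open an issue at any time or keep replying under this one. We have launched an issue-tackling campaign for the PaddlePaddle suites, and interested developers are welcome to join: PaddlePaddle/PaddleOCR#10223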

shiyutang avatar Jul 07 '23 01:07 shiyutang

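Which model are you training? This is only a warning: the current version implicitly converts a 0-D tensor to a 1-D tensor, and from release 2.6 onward an error will be raised instead. It does not actually affect training.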

Asthestarsfalll avatar Jul 07 '23 04:07 Asthestarsfalll

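I hit the error when training on the voc12 dataset directly with paddleseg in AI Studio. Versions 2.5 and above all show this warning; I haven't tried 2.4 and below. With 'export FLAGS_set_to_1d=False' the warning goes away, but validation then raises an error. The error turned out to come from taking element 0 of a single number rather than of a list:

  1. In /home/aistudio/PaddleSeg/paddleseg/core/train.py, change 'avg_loss += loss.numpy()[0]' to 'avg_loss += loss.numpy()'.
  2. 'avg_loss_list = [l[0] / log_iters for l in avg_loss_list]' likewise has to become 'avg_loss_list = [l / log_iters for l in avg_loss_list]'.
  3. In metrics.py, 'pred_area.append(paddle.sum(paddle.cast(pred_i, "int32")))' also has to become 'pred_area.append(paddle.sum(paddle.cast(pred_i, "int32")).unsqueeze(0))'.

Only with these changes does it run normally.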
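The indexing changes in items 1 and 2 can also be written so that they do not depend on whether the loss arrives as 0-D or 1-D. A minimal sketch, assuming each accumulated loss converts cleanly with float() (the warning text itself recommends float(Tensor) for single-element tensors); average_losses is a hypothetical helper, not a PaddleSeg function:

```python
def average_losses(loss_sums, log_iters):
    """Average accumulated single-element losses over log_iters iterations.

    float() is used instead of indexing with [0], so each entry may be a
    plain number or any single-element object that implements __float__.
    """
    if log_iters <= 0:
        raise ValueError("log_iters must be positive")
    return [float(total) / log_iters for total in loss_sums]
```

This only covers the averaging step; the metrics.py change in item 3 is a separate shape problem.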

I downloaded paddleseg directly. My only modification was in /home/aistudio/PaddleSeg/configs/deeplabv3p/deeplabv3p_resnet50_os8_voc12aug_512x512_40k.yml, where I changed '_base_: '../_base_/pascal_voc12aug.yml'' to '_base_: '../_base_/pascal_voc12.yml''. I haven't tried other datasets yet; this happens on the voc12 dataset.

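Also, when using 'pascal_voc12aug.yml', I need to run /home/aistudio/PaddleSeg/tools/voc_augment.py first, right? Whenever I run it I get a network connection error and benchmark.tgz cannot be downloaded. I'm considering downloading it in a browser and uploading it as a dataset.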

gitlonglong avatar Jul 07 '23 10:07 gitlonglong

@gitlonglong

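  1. Which paddleseg version are you using? The .numpy()[0] problem should have been fixed in this PR;
  2. You can look for benchmark.tgz among the AI Studio datasets. I found one; see whether it works for you by adding it directly to your data: https://aistudio.baidu.com/aistudio/datasetdetail/65497.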

Asthestarsfalll avatar Jul 07 '23 10:07 Asthestarsfalll

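@Asthestarsfalll OK, I'll give it a try, thanks! I started with 2.6 and later switched to 2.8. I also checked again: 2.7 and 2.8 did fix this bug, but with the VOC dataset an error is still raised at this line in train.py: avg_loss_list = [l[0] / log_iters for l in avg_loss_list] (writing l[0] raises an error; changing it to l works). metrics.py also raises an error; only with the changes I described above does it run on VOC12.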

gitlonglong avatar Jul 07 '23 13:07 gitlonglong

@gitlonglong If it's convenient, could you share your AI Studio project so that I can track down the problem?

Asthestarsfalll avatar Jul 08 '23 07:07 Asthestarsfalll

I'm seeing this problem too, with paddleseg version 2.8.

I0712 14:24:21.198009 19039 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0712 14:24:21.626497 19039 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0712 14:24:21.626669 19039 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0712 14:24:21.626710 19039 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0712 14:24:22.062738 19039 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0712 14:24:22.062984 19039 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0712 14:24:22.063066 19039 eager_method.cc:140] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.

a-strong-python avatar Jul 12 '23 06:07 a-strong-python

When the run reaches the iteration at which the model is first saved, the following error is raised:

2023-07-12 14:26:44 [INFO] [TRAIN] epoch: 17, iter: 500/10000, loss: 1.8821, lr: 0.000972, batch_cost: 0.4265, reader_cost: 0.30989, ips: 14.0681 samples/sec | ETA 01:07:31
2023-07-12 14:26:44 [INFO] Start evaluating (total_samples: 45, total_iters: 45)...
Traceback (most recent call last):
  File "/home/aistudio/PaddleSeg/tools/train.py", line 195, in <module>
    main(args)
  File "/home/aistudio/PaddleSeg/tools/train.py", line 170, in main
    train(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddleseg/core/train.py", line 315, in train
    mean_iou, acc, _, _, _ = evaluate(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddleseg/core/val.py", line 161, in evaluate
    intersect_area, pred_area, label_area = metrics.calculate_area(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddleseg/utils/metrics.py", line 57, in calculate_area
    pred_area = paddle.concat(pred_area)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/tensor/manipulation.py", line 1121, in concat
    return _C_ops.concat(input, axis)
ValueError: (InvalidArgument) The axis is expected to be in range of [0, 0), but got 0
  [Hint: Expected axis >= -rank && axis < rank == true, but received axis >= -rank && axis < rank:0 != true:1.] (at ../paddle/phi/infermeta/multiary.cc:961)

a-strong-python avatar Jul 12 '23 06:07 a-strong-python

2023-07-12 16:31:44 [INFO] [TRAIN] epoch: 12, iter: 500/1000, loss: 0.3831, lr: 0.000537, batch_cost: 0.1358, reader_cost: 0.07347, ips: 29.4629 samples/sec | ETA 00:01:07
2023-07-12 16:31:44 [INFO] Start evaluating (total_samples: 45, total_iters: 45)...
Traceback (most recent call last):
  File "/home/aistudio/PaddleSeg/tools/train.py", line 195, in <module>
    main(args)
  File "/home/aistudio/PaddleSeg/tools/train.py", line 170, in main
    train(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddleseg/core/train.py", line 315, in train
    mean_iou, acc, _, _, _ = evaluate(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddleseg/core/val.py", line 161, in evaluate
    intersect_area, pred_area, label_area = metrics.calculate_area(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddleseg/utils/metrics.py", line 57, in calculate_area
    pred_area = paddle.concat(pred_area)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/tensor/manipulation.py", line 1121, in concat
    return _C_ops.concat(input, axis)
ValueError: (InvalidArgument) The axis is expected to be in range of [0, 0), but got 0
  [Hint: Expected axis >= -rank && axis < rank == true, but received axis >= -rank && axis < rank:0 != true:1.] (at ../paddle/phi/infermeta/multiary.cc:961)

a-strong-python avatar Jul 12 '23 08:07 a-strong-python

It looks like the entries in pred_area are all 0-dim tensors. You could open a new issue for this.

Asthestarsfalll avatar Jul 12 '23 10:07 Asthestarsfalll

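The replies above have fully answered the question. If you have a new problem, feel free to open an issue at any time or keep replying under this one. We have launched an issue-tackling campaign for the PaddlePaddle suites, and interested developers are welcome to join: https://github.com/PaddlePaddle/PaddleOCR/issues/10223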

ToddBear avatar Jul 13 '23 12:07 ToddBear

@Asthestarsfalll You don't need my AI Studio project. Just create any project; it should take about 10 minutes to set up, and running it will show the problem.

gitlonglong avatar Jul 14 '23 11:07 gitlonglong

@a-strong-python Make the change I described above, adding one dimension, and it will run successfully. Like this: pred_area.append(paddle.sum(paddle.cast(pred_i, "int32")).unsqueeze(0))
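Why adding a dimension helps can be illustrated with NumPy, used here purely as an analogy. The assumption is that Paddle's concat treats 0-D inputs the same way, which is consistent with the "range of [0, 0)" message in the traceback; the values in areas are made up for illustration:

```python
import numpy as np

# Three 0-D values, standing in for the per-class sums collected in pred_area.
areas = [np.array(3), np.array(5), np.array(7)]

try:
    np.concatenate(areas)  # 0-D inputs have no axis 0 to join along
    concat_failed = False
except ValueError:
    concat_failed = True

# Giving each value a leading axis of length 1 (the role played by
# unsqueeze(0) in the fix above) makes concatenation valid.
fixed = np.concatenate([a.reshape(1) for a in areas])

# Stacking creates the new axis itself, so it accepts 0-D inputs directly.
stacked = np.stack(areas)
```

Both fixed and stacked end up as 1-D arrays with one entry per class.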

gitlonglong avatar Jul 14 '23 11:07 gitlonglong

I made all of those changes and it still fails. I'm running the official example as is, so I don't know what went wrong.

a-strong-python avatar Jul 15 '23 14:07 a-strong-python

@gitlonglong @a-strong-python

It seems the paddleseg package on PyPI has a problem; installing from source does not show this issue.

Asthestarsfalll avatar Aug 16 '23 04:08 Asthestarsfalll

@gitlonglong I installed version 2.8 from source and made the changes you described, but this warning still appears when training on my own dataset.

KKWY0909 avatar Aug 21 '23 03:08 KKWY0909

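@KKWY0909 After installing from source, the functions that get called are the original ones fixed at install time, not your modified ones. There are two ways around this: reinstall after every modification, or use a somewhat older version such as 2.6, where your modifications take effect directly.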

gitlonglong avatar Aug 21 '23 03:08 gitlonglong

The link given for reference (PaddlePaddle/PaddleOCR#10302) is of no value at all!!! It isn't about that warning in the first place.

Ericgone avatar Aug 24 '23 02:08 Ericgone

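@Ericgone This is only a warning: the current version implicitly converts a 0-D tensor to a 1-D tensor, and from release 2.6 onward an error will be raised instead. It does not actually affect training.

You can ignore the warning with export FLAGS_set_to_1d=False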
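For runs started from Python rather than from a shell, the same setting might be applied as sketched below. This assumes Paddle reads FLAGS_* environment variables when it is first imported, so the assignment has to come before any import of paddle. Other posts in this thread report that turning the flag off led to errors elsewhere, so it is something to try rather than a guaranteed fix.

```python
import os

# Assumption: Paddle picks up FLAGS_* environment variables at import time,
# so this has to run before `import paddle` (or anything that imports it).
os.environ["FLAGS_set_to_1d"] = "False"

# import paddle  # import Paddle / PaddleSeg only after this point
```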

Asthestarsfalll avatar Aug 24 '23 10:08 Asthestarsfalll

This warning fills up the whole training log. Can it be limited to appearing only once?
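One workaround that stays outside Paddle is to filter the log after the fact. A minimal sketch, assuming the warnings keep the prefix format seen in this thread (for example "I0706 13:09:12.768538 3772 eager_method.cc:140] "); dedupe_log is a hypothetical helper, not part of Paddle or PaddleSeg:

```python
import re

# Matches the log prefix seen in this thread, e.g.
# "I0706 13:09:12.768538 3772 eager_method.cc:140] "
GLOG_PREFIX = re.compile(r"^[IWEF]\d{4} \d{2}:\d{2}:\d{2}\.\d+\s+\d+ \S+:\d+\] ")


def dedupe_log(lines):
    """Yield lines, keeping only the first occurrence of each prefixed message."""
    seen = set()
    for line in lines:
        match = GLOG_PREFIX.match(line)
        if match is None:
            yield line  # ordinary training output passes through untouched
            continue
        message = line[match.end():]  # text after the timestamp/pid/file prefix
        if message not in seen:
            seen.add(message)
            yield line
```

Wrapped as sys.stdout.writelines(dedupe_log(sys.stdin)), it could sit at the end of a pipe after the training command; the warnings may be written to stderr, so that stream would need to be redirected into the pipe as well.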

TerryBryant avatar Aug 30 '23 11:08 TerryBryant

  1. Try export FLAGS_set_to_1d=False
  2. Switch to a lower version of paddle

Asthestarsfalll avatar Aug 30 '23 13:08 Asthestarsfalll

Method 1: Failed, NCCL error ../paddle/fluid/distributed/collective/process_group_nccl.cc:660 'internal error' LAUNCH INFO 2023-08-30 13:52:47,376 Exit code 1

Method 2: downgrading from paddle 2.5.1 to 2.4.2 worked, thanks.

TerryBryant avatar Aug 30 '23 13:08 TerryBryant

I ran into the same problem when training on my own dataset with paddleseg (I used the officially provided PaddleSeg-release-2.8 for training, with no code changes other than the dataset configuration), but the suggested command, export FLAGS_set_to_1d=False, had no effect. Changing paddle 2.5 to 2.4 in AI Studio did not solve it either. [image]

henryccl avatar Sep 20 '23 03:09 henryccl

@henryccl You could try the PaddleSeg-release-2.8.1 version.

KKWY0909 avatar Sep 20 '23 09:09 KKWY0909

Thanks! With 2.8.1 the error is indeed gone.

henryccl avatar Sep 22 '23 06:09 henryccl

Where can I install 2.8.1 from? No matter how I install it, I end up with 2.8.0...

jason660519 avatar Sep 24 '23 02:09 jason660519