PaddleSeg
[General Issue] Problem when training the PaddleSeg Matting Model
Environment
- PaddleSeg version: release/2.5
- PaddlePaddle version: PaddlePaddle 2.3.1 (gpu)
- Operating system: Linux
- Python version: Python 3.7
- CUDA/cuDNN version: CUDA11.1/cuDNN 7.6
- Using Google Colab
Issue
I tried to train the PaddleSeg Matting model on a custom dataset. Training ran normally until the 1000th iteration, where I got the following error:
2022-07-06 21:11:16 [INFO] [TRAIN] epoch=167, iter=1000/100000, loss=1.0582, lr=0.020000, batch_cost=2.3336, reader_cost=1.07924, ips=6.8565 samples/sec | ETA 64:10:22
2022-07-06 21:11:16 [INFO] [TRAIN] [LOSS] all=1.0582 semantic=0.0198 detail=0.9192 fusion=0.1192 fusion_l1=0.0808 fusion_comp=0.0347 fusion_con_sem=0.0037
Traceback (most recent call last):
  File "train.py", line 174, in <module>
    main(args)
  File "train.py", line 169, in main
    eval_begin_iters=args.eval_begin_iters)
  File "/content/PaddleSeg/Matting/core/train.py", line 223, in train
    log_writer=log_writer, vis_dict=vis_dict, step=iter)
  File "/content/PaddleSeg/Matting/core/train.py", line 54, in visual_in_traning
    log_writer.add_image(tag=key, img=value, step=step)
  File "/usr/local/lib/python3.7/dist-packages/visualdl/writer/writer.py", line 217, in add_image
    dataformats=dataformats))
  File "/usr/local/lib/python3.7/dist-packages/visualdl/component/base_component.py", line 171, in image
    image_bytes = imgarray2bytes(image_array)
  File "/usr/local/lib/python3.7/dist-packages/visualdl/component/base_component.py", line 74, in imgarray2bytes
    img_bin = Image.fromarray(np.uint8(buf)).tobytes("raw")
  File "/usr/local/lib/python3.7/dist-packages/PIL/Image.py", line 2728, in fromarray
    size = shape[1], shape[0]
IndexError: tuple index out of range
Can someone please help me with this issue? Thanks 😃
You could first try running the workflow from the tutorial. If the tutorial runs without errors, the problem is in your data.
Thanks for your reply,
I've just tried it on the PPM-100 dataset and I still get the same error on the 1000th iteration.
What about the develop branch? Does it have the same problem?
I got exactly the same error with the develop branch. I also tried training without the VisualDL (v2.3.0) argument, and it worked.
I cannot reproduce your problem. Did you change the code?
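Hello, I ran into the same problem. When training with the PPM-100 dataset provided in the tutorial, the error occurs at iter=1000. After investigating, I found that the problem arises during the VisualDL (vdl) step (Matting/ppmatting/core/train.py, line 255). When the call chain reaches the imgarray2bytes method, line 94 shown in the figure changes the image's shape from (512, 512, 3) to (306912,), and the fromarray method then fails when it accesses shape[1]. The call stack is shown in the figure below. I hope this gets your attention, thanks~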
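To illustrate the diagnosis above, here is a minimal sketch in plain Python. It mirrors only the failing line from the traceback (size = shape[1], shape[0]) and uses bare shape tuples instead of real arrays. The helper name size_from_shape is hypothetical and is not part of PIL or VisualDL.

```python
def size_from_shape(shape):
    """Mirror the failing line from the traceback: size = shape[1], shape[0]."""
    return shape[1], shape[0]


# A (height, width, channels) shape has an index 1, so the lookup works.
image_shape = (512, 512, 3)
size_ok = size_from_shape(image_shape)

# A flattened 1-D shape has no index 1, so the same lookup raises IndexError.
flat_shape = (306912,)
try:
    size_from_shape(flat_shape)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(size_ok)        # (512, 512)
print(error_message)  # tuple index out of range
```

This only shows why a 1-D shape triggers the reported IndexError. It does not show what inside imgarray2bytes flattens the image.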