PaddleSeg [General Issue] Problem when training the PaddleSeg Matting Model

Environment

PaddleSeg version: release/2.5
PaddlePaddle version: PaddlePaddle 2.3.1 (gpu)
Operating system: Linux
Python version: Python 3.7
CUDA/cuDNN version: CUDA11.1/cuDNN 7.6
Using Google Colab

Issue

I tried to train the PaddleSeg Matting model on a custom dataset. Training ran normally until the 1000th iteration, where I got the following error:

2022-07-06 21:11:16 [INFO]	[TRAIN] epoch=167, iter=1000/100000, loss=1.0582, lr=0.020000, batch_cost=2.3336, reader_cost=1.07924, ips=6.8565 samples/sec | ETA 64:10:22
2022-07-06 21:11:16 [INFO]	[TRAIN] [LOSS] all=1.0582 semantic=0.0198 detail=0.9192 fusion=0.1192 fusion_l1=0.0808 fusion_comp=0.0347 fusion_con_sem=0.0037
Traceback (most recent call last):
  File "train.py", line 174, in <module>
    main(args)
  File "train.py", line 169, in main
    eval_begin_iters=args.eval_begin_iters)
  File "/content/PaddleSeg/Matting/core/train.py", line 223, in train
    log_writer=log_writer, vis_dict=vis_dict, step=iter)
  File "/content/PaddleSeg/Matting/core/train.py", line 54, in visual_in_traning
    log_writer.add_image(tag=key, img=value, step=step)
  File "/usr/local/lib/python3.7/dist-packages/visualdl/writer/writer.py", line 217, in add_image
    dataformats=dataformats))
  File "/usr/local/lib/python3.7/dist-packages/visualdl/component/base_component.py", line 171, in image
    image_bytes = imgarray2bytes(image_array)
  File "/usr/local/lib/python3.7/dist-packages/visualdl/component/base_component.py", line 74, in imgarray2bytes
    img_bin = Image.fromarray(np.uint8(buf)).tobytes("raw")
  File "/usr/local/lib/python3.7/dist-packages/PIL/Image.py", line 2728, in fromarray
    size = shape[1], shape[0]
IndexError: tuple index out of range

Can someone please help me with this issue? Thanks 😃

Jul 07 '22 12:07 omarsemma

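You could first try running the whole pipeline by following the tutorial. If the tutorial runs through, the problem is with your data.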

Jul 08 '22 02:07 wuyefeilin

Thanks for your reply,

I've just tried it on the PPM-100 dataset and I still get the same error on the 1000th iteration.

Jul 08 '22 09:07 omarsemma

What about the develop branch? Does it have the same problem?

Jul 12 '22 02:07 wuyefeilin

I got exactly the same error with the develop branch. I also tried training without the VisualDL (v2.3.0) argument, and it worked.

Jul 12 '22 16:07 omarsemma

I cannot reproduce your problem. Did you change the code?

Aug 11 '22 10:08 wuyefeilin

What about the develop branch? Does it have the same problem?

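Hello, I ran into the same problem. When training with the PPM-100 dataset provided in the tutorial, the error occurs at iter=1000. After investigating, I found that the problem arises during the VisualDL (vdl) logging step (Matting/ppmatting/core/train.py, line 255). When the nested calls reach the imgarray2bytes method, line 94 in the screenshot changes the image shape from (512, 512, 3) to (306912,), after which the access to shape[1] in the fromarray method fails. See the screenshot below for the call stack. I hope this gets your attention, thanks~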

Feb 07 '23 02:02 unihornWwan
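
The failure mode described in the last comment can be illustrated without VisualDL or Pillow installed. The sketch below only mimics the indexing from the final traceback frame (size = shape[1], shape[0]) on plain shape tuples. The names size_from_shape and is_loggable_image_shape are hypothetical helpers written for this illustration, not part of PaddleSeg, VisualDL, or Pillow, and the guard is just one possible check someone might add before logging an image.

```python
def size_from_shape(shape):
    """Mimic the indexing in the last traceback frame: size = shape[1], shape[0]."""
    return shape[1], shape[0]


def is_loggable_image_shape(shape):
    """Hypothetical guard: accept only HW or HWC shapes before logging an image."""
    return len(shape) in (2, 3)


# A normal HWC image shape works: width and height are read from the first two axes.
print(size_from_shape((512, 512, 3)))

# The flattened shape reported in the thread has a single axis, so shape[1] does not exist.
flat_shape = (306912,)
print(is_loggable_image_shape(flat_shape))
try:
    size_from_shape(flat_shape)
except IndexError as exc:
    print("IndexError:", exc)
```

A check like is_loggable_image_shape would not fix whatever flattens the array, but it would turn the opaque IndexError into an early, explicit signal that the array handed to the logger is no longer two- or three-dimensional.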
