cleargrasp Problem training pytorch

Problem training pytorch_networks/masks subnet

Open gajdosech2 opened this issue 1 year ago • 0 comments

Hello,

First off, thank you very much for your work. I tried retraining the segmentation network at a higher resolution (800x600), but I could not get the training to work, even at the default 256x256 resolution. Specifically, I downloaded your training dataset and tried to continue training from your checkpoint_mask.pth. These are the results after the first epoch:

Train Epoch Loss: 11751.5604 Train mIoU: 0.0000

Validation Epoch Loss: 704898596864.0000 Validation mIoU: 0.0000

Test Synthetic Epoch Loss: 3180020034638.7690 Test Synthetic mIoU: 0.0000

Visually, the network produces just black frames, whereas your trained checkpoint works as expected. I have tried a similar approach for the surface normal prediction network (again using your pre-trained weights and just increasing the resolution) and there it worked fine.
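
To illustrate how all-black output relates to the zero mIoU above, here is a minimal sketch of a foreground IoU computation in plain Python. It assumes the metric is computed over the foreground (mask) class only; `binary_iou` and the toy masks are hypothetical and are not taken from the cleargrasp code, whose metric may be defined differently.

```python
def binary_iou(pred, target):
    """IoU of the foreground (value 1) class for two equally sized 2D masks."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == 1 and t == 1:
                intersection += 1
            if p == 1 or t == 1:
                union += 1
    # Convention chosen for this sketch: if neither mask contains any
    # foreground pixel, report a perfect score instead of dividing by zero.
    return intersection / union if union else 1.0


# Toy ground-truth mask with three foreground pixels.
target = [[0, 1, 1],
          [0, 1, 0]]

# An all-black prediction has no foreground pixels, so the intersection is 0.
all_black = [[0, 0, 0],
             [0, 0, 0]]

# A prediction that overlaps the target on two of four union pixels.
partial = [[0, 1, 0],
           [0, 1, 1]]

print(binary_iou(all_black, target))  # 0.0
print(binary_iou(partial, target))    # 0.5
```

Under this assumption, any image that contains foreground pixels scores exactly 0 when the prediction is all black, which would match the reported mIoU of 0.0000.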

I believe there is a problem with the loss function or the training code for this mask network, but I have not been able to pinpoint it.

I would appreciate any help, thank you.

Jul 03 '23 07:07 gajdosech2
