awesome-semantic-segmentation-pytorch

Trend of mIoU

RAYRAYRAYRita opened this issue 6 years ago · 2 comments

Hello~ Thanks for your work! There is something puzzling me. I have trained ICNet on Cityscapes many times with various configs, and I visualized the val mIoU after each epoch during training. I can see the mIoU trending upward, but even after increasing the number of epochs from 50 to 120, the curve still looks like this, as if it is about to keep rising (screenshot: 2019-10-23 15-10-06). Does this imply that more epochs are needed?
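
For reference, a minimal sketch of logging val mIoU after each epoch and plotting the trend like the curve above. The `validate` function here is only a dummy stand-in for a real validation pass that returns mIoU, so the numbers are synthetic; only the logging/plotting pattern matters.

```python
# Minimal sketch: record val mIoU after every epoch and plot the trend.
# `validate` is a dummy stand-in for a real validation pass that returns mIoU.
import random

import matplotlib
matplotlib.use("Agg")  # headless backend, works on a training server
import matplotlib.pyplot as plt


def validate(epoch):
    """Stand-in for the real validation pass; returns a fake mIoU in [0, 1]."""
    return 0.45 + 0.25 * (1.0 - 1.0 / (epoch + 1)) + random.uniform(-0.01, 0.01)


num_epochs = 120
miou_history = [validate(epoch) for epoch in range(num_epochs)]

plt.plot(range(1, num_epochs + 1), miou_history)
plt.xlabel("epoch")
plt.ylabel("val mIoU")
plt.title("val mIoU per epoch")
plt.savefig("miou_trend.png")
```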

In addition, I don't know why, at the beginning of training or evaluation, there is about a 50% chance that my computer crashes. Could you please help me? Thanks in advance!

RAYRAYRAYRita commented on Oct 24 '19

  1. Try training a different model, such as BiSeNet, DeepLab, or PSPNet, on the Cityscapes dataset. I am sure you will get good results.

  2. There are different reasons why your PC might crash: either you don't have sufficient GPU memory, or there are version issues with CUDA/cuDNN. What is your GPU? One fix is to run nvidia-smi in a shell prompt and kill the processes that are holding GPU memory before starting your training (be careful with this; you might accidentally kill system-related processes). If you are running the model from a Jupyter notebook, try restarting the Jupyter kernel every time you run it. Another option is to reduce the batch_size and then start the training (see the sketch after this list).
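
A rough sketch of the memory-related advice above, using standard PyTorch calls. The dataset here is a random-tensor placeholder standing in for the Cityscapes loader, and the candidate batch sizes are just examples; the idea is to report what the GPU has and to retry with a smaller batch_size if the first big allocation fails.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

if device.type == "cuda":
    props = torch.cuda.get_device_properties(device)
    print(f"GPU: {props.name}, total memory {props.total_memory / 1024**3:.1f} GiB, "
          f"allocated by this process {torch.cuda.memory_allocated(device) / 1024**3:.2f} GiB")

# Placeholder data; in practice this would be the Cityscapes DataLoader.
dataset = TensorDataset(torch.randn(32, 3, 512, 1024),
                        torch.randint(0, 19, (32, 512, 1024)))

# If training crashes with out-of-memory right at the start, retry with a smaller batch_size.
for batch_size in (8, 4, 2, 1):
    try:
        loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
        images, targets = next(iter(loader))
        images = images.to(device)  # the first large allocation typically happens here or in the first forward pass
        print(f"batch_size={batch_size} fits")
        break
    except RuntimeError as err:
        if "out of memory" not in str(err):
            raise
        torch.cuda.empty_cache()  # release cached blocks before retrying
        print(f"batch_size={batch_size} ran out of memory, trying a smaller one")
```

Note that this only helps when the crash comes from your own run's allocations; memory held by stray processes or an old Jupyter kernel still has to be freed by killing them or restarting the kernel, as described above.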

sainatarajan commented on Oct 26 '19

@sainatarajan Thanks for your reply!

  1. Yes, I think you are right. I need real-time segmentation, so maybe my next step is to try BiSeNet.
  2. My GPU is a GeForce GTX 1080 Ti. Actually, I was running nvidia-smi before and during training to monitor GPU memory. I found that at the beginning of training/eval the memory usage spikes sharply (which is exactly when my computer has a 50% chance of crashing; sometimes it gets through and sometimes it fails, and just once it reported 'out of memory') and then drops back to a normal level. So, to keep enough memory free at the start, I had to reduce batch_size or crop_size even though memory was sufficient for the rest of training. But I'm afraid this would affect the results (see the sketch below). Thanks again for your reply~
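
One workaround not mentioned in the thread, sketched below under the assumption that the spike comes from the per-step batch itself: gradient accumulation lets you cut the batch_size of each forward/backward pass while keeping the same effective batch size for the optimizer, so the reduction has less impact on results. The model and dataset here are toy stand-ins for ICNet and Cityscapes.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy stand-ins; in practice `model` would be ICNet and `dataset` the Cityscapes loader.
model = nn.Conv2d(3, 19, kernel_size=1).to(device)
dataset = TensorDataset(torch.randn(16, 3, 256, 512),
                        torch.randint(0, 19, (16, 256, 512)))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

micro_batch_size = 2   # small enough to survive the start-of-training memory spike
accum_steps = 4        # effective batch size = micro_batch_size * accum_steps = 8
loader = DataLoader(dataset, batch_size=micro_batch_size, shuffle=True)

optimizer.zero_grad()
for step, (images, targets) in enumerate(loader):
    images, targets = images.to(device), targets.to(device)
    loss = criterion(model(images), targets) / accum_steps  # scale so gradients match the big batch
    loss.backward()
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```

One caveat: BatchNorm statistics are still computed per micro-batch, so this is not exactly equivalent to training with the larger batch, but it usually hurts far less than simply halving batch_size or crop_size.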

RAYRAYRAYRita commented on Oct 26 '19