pytorch-deeplab-xception

Achieved 79.11% mIoU with ResNet101 backbone

Open Nadavc220 opened this issue 4 years ago • 5 comments

First of all, thanks for creating this repo. It is very clear and easier to use than the original TensorFlow model. I ran a few training experiments on VOC2012 and saw that many people are trying to train the network to match the result of the uploaded checkpoints (78.43%). Like many others, I felt there was no clear answer for those of us who use a single GPU and run out of memory with a batch size of 16. I wanted to share my script parameters to help anyone who finds it hard to train the network above 78%.

My mIoU peaked at 79.1% around epoch 45 with the following call: python train.py --backbone resnet --lr 0.0035 --workers 14 --epochs 50 --batch-size 8 --gpu-ids 0 --checkname deeplab-resnet --eval-interval 1 --dataset pascal
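
For readers unsure what the mIoU figure measures, here is a minimal sketch, assuming the usual definition: per-class intersection over union computed from a confusion matrix, then averaged over the classes that appear. The function name `mean_iou` and the toy matrix are illustrative only and are not taken from this repo's evaluation code, which may differ in details such as how absent classes are handled.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    Assumed layout: rows are ground-truth classes, columns are predictions.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]                          # pixels correctly labelled c
        fn = sum(confusion[c]) - tp                   # class-c pixels labelled otherwise
        fp = sum(row[c] for row in confusion) - tp    # other pixels labelled c
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both ground truth and prediction
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy two-class example (hypothetical pixel counts).
toy = [[50, 10],
       [5, 35]]
print(f"{mean_iou(toy):.4f}")  # prints 0.7346
```

In the toy example the per-class IoUs are 50/65 and 35/50, and their mean is about 0.7346. A reported 79.1% mIoU is the same kind of average taken over the dataset's classes on the validation set.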

Nadavc220 avatar May 28 '20 18:05 Nadavc220

@Nadavc220 Is --workers 14 a typo?

goldhuang avatar Jun 05 '20 22:06 goldhuang

Weirdly enough, this is not a typo. I did not intend to run so many workers, but I guess I set it by mistake when I ran the script. There is no reason to use 14 workers.

Nadavc220 avatar Jun 06 '20 05:06 Nadavc220

Yes, I did. Apart from the changes mentioned in the post, everything was left at its default value.

On Sun, Jul 5, 2020 at 7:42 PM htwang14 wrote:

Did you use the SBD dataset to achieve this high mIoU?

Nadavc220 avatar Jul 05 '20 17:07 Nadavc220

Thanks!

htwang14 avatar Jul 05 '20 22:07 htwang14

Excuse me, how do I use the PASCAL VOC 2012 dataset to train the model?

tiesanguaixia avatar Aug 04 '21 13:08 tiesanguaixia