Training PSPNet

Open DonghyunK opened this issue 8 years ago • 10 comments

Hi,

I am trying to train PSPNet50.

With a Titan X, I can only train it with a batch size of 1. I cannot train it with a batch size of 2, even using 2 Titan GPUs.

Could you please let me know how many GPUs you used to train PSPNet, and how many GPUs are needed to train it successfully?

Thank you so much.

DonghyunK avatar Feb 08 '17 00:02 DonghyunK

The ResNet101 experiments were trained with 4 GPUs. Our training code contains many memory optimizations, so it needs less memory.

wistone avatar Feb 09 '17 03:02 wistone

@wistone Thank you for the reply.

Could you please let me know how you optimize memory?

Thank you.

DonghyunK avatar Feb 09 '17 03:02 DonghyunK

https://github.com/dmlc/mxnet-memonger

You can refer to the memory optimization in MXNet.

wistone avatar Feb 09 '17 03:02 wistone

@wistone

Did you train a model using MXNet and then convert the MXNet model into a Caffe model?

Thank you.

DonghyunK avatar Feb 09 '17 03:02 DonghyunK

We implemented this method in our Caffe training platform.

wistone avatar Feb 09 '17 03:02 wistone

@wistone

Is it publicly available?

If not, could you please let me know how you implemented this method in Caffe?

Thank you

DonghyunK avatar Feb 09 '17 03:02 DonghyunK

No, it is not public. You can read the paper for the theory and refer to the code in MXNet. It is doable.

wistone avatar Feb 09 '17 04:02 wistone
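
For readers following the memory-optimization pointer above: the idea behind mxnet-memonger, as far as I understand it, is to trade computation for memory by keeping only some activations during the forward pass and recomputing the rest during the backward pass. The following is a minimal pure-Python sketch of that idea on a toy chain of scalar layers. It is not the authors' Caffe implementation (which is not public) and not the MXNet code; `make_layers`, `backward_full`, `backward_checkpointed`, and the `segment` parameter are hypothetical names chosen for illustration.

```python
import math


def make_layers(n):
    """Build a toy chain of n scalar layers as (forward, derivative) pairs."""
    layers = []
    for i in range(n):
        w = 0.5 + 0.1 * i  # arbitrary fixed weight for illustration
        # y = tanh(w * x), dy/dx = w * (1 - tanh(w * x)^2)
        fwd = lambda x, w=w: math.tanh(w * x)
        grad = lambda x, w=w: w * (1.0 - math.tanh(w * x) ** 2)
        layers.append((fwd, grad))
    return layers


def backward_full(layers, x):
    """Baseline: keep every layer input for the backward pass (O(n) storage)."""
    inputs = []
    for fwd, _ in layers:
        inputs.append(x)
        x = fwd(x)
    g = 1.0
    for (_, grad), inp in zip(reversed(layers), reversed(inputs)):
        g *= grad(inp)  # chain rule, last layer first
    return x, g, len(inputs)


def backward_checkpointed(layers, x, segment):
    """Keep only the input of every `segment`-th layer; recompute the rest."""
    checkpoints = {}
    for i, (fwd, _) in enumerate(layers):
        if i % segment == 0:
            checkpoints[i] = x
        x = fwd(x)
    out = x

    g = 1.0
    peak = len(checkpoints)
    # Process segments from the last one to the first one.
    for start in sorted(checkpoints, reverse=True):
        end = min(start + segment, len(layers))
        # Recompute the inputs inside this segment from its checkpoint.
        inputs = []
        h = checkpoints[start]
        for fwd, _ in layers[start:end]:
            inputs.append(h)
            h = fwd(h)
        # Rough count of values held at once: checkpoints plus this segment.
        peak = max(peak, len(checkpoints) + len(inputs))
        for (_, grad), inp in zip(reversed(layers[start:end]), reversed(inputs)):
            g *= grad(inp)
    return out, g, peak


if __name__ == "__main__":
    layers = make_layers(16)
    out_a, grad_a, stored_a = backward_full(layers, 0.3)
    out_b, grad_b, stored_b = backward_checkpointed(layers, 0.3, segment=4)
    print("same gradient:", abs(grad_a - grad_b) < 1e-12)
    print("values stored:", stored_a, "vs", stored_b)
```

Both functions return the same output and gradient; the checkpointed version holds fewer values at once at the cost of one extra forward pass per segment. Choosing `segment` near the square root of the number of layers balances the number of checkpoints against the size of each recomputed segment.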

@hszhao @wistone Could you explain what the three accuracy outputs in the training phase mean? There is only one SegAccuracy, so I expected a single accuracy output. Why are there three? Example output:
I0605 10:56:09.200160 16167 solver.cpp:245] Train net output #0: accuracy = 0.971667
I0605 10:56:09.200170 16167 solver.cpp:245] Train net output #1: accuracy = 0.873694
I0605 10:56:09.200176 16167 solver.cpp:245] Train net output #2: accuracy = 0.803811

LearnerInGithub avatar Jun 05 '17 02:06 LearnerInGithub

Wait, @LearnerInGithub, where did you get the training files?!

balloch avatar Jun 22 '17 05:06 balloch

@DonghyunK Can you publish your training script?

ThienAnh avatar Oct 05 '17 09:10 ThienAnh