PSPNet-Keras-tensorflow

Any way to train?

Open yijiezh opened this issue 8 years ago • 24 comments

How do I train with a given dataset?

yijiezh avatar Aug 07 '17 21:08 yijiezh

@EvanzzzZ Sorry, we have not yet figured out a good way to train this code. Training tricks are the key to PSPNet's good performance, and for NDA reasons the author has not shared the training code. You can try using OpenMPI to train PSPNet on several GPUs to get good performance, but I am not sure how difficult that would be.

wcy940418 avatar Aug 07 '17 21:08 wcy940418

In this repo, you converted the Caffe model to TensorFlow and used the trained parameters, but didn't get similar results?

yijiezh avatar Aug 07 '17 22:08 yijiezh

@EvanzzzZ Yes, this may be due to some difference between Caffe and TensorFlow, but I still don't know why.

wcy940418 avatar Aug 07 '17 23:08 wcy940418

How can you verify that your layers_builder builds exactly the same model as in the paper?

yijiezh avatar Aug 07 '17 23:08 yijiezh

Since you already have the network, why can't you train the model from scratch?

yijiezh avatar Aug 07 '17 23:08 yijiezh

@EvanzzzZ If the explicit parameters (e.g. weights, biases) were not identical to those of the model in the paper, they could not be assigned to our network, since all the parameters are ported from the original work. As for the internal architecture of the network, we just use the common layer settings in Keras. If the Caffe code uses some "magical" layers or hidden settings, I at least have no idea how to reproduce them.

PSPNet depends heavily on some "magic" training optimization tricks, and the batch size they used is impossible for single-card training with a normal training function. A bigger batch size combined with batch normalization brings a very significant improvement, but it is hard to perform batch normalization across multiple GPUs; at least I haven't done this before.

wcy940418 avatar Aug 08 '17 01:08 wcy940418
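
To illustrate the multi-GPU batch normalization point above, here is a minimal NumPy sketch. It is not code from this repo or from the PSPNet authors. The batch size, shard count and the `batch_norm_stats` helper are all hypothetical. It shows that normalizing each device's shard separately uses different statistics than the full batch, and that the full-batch statistics can be recovered by exchanging per-shard moments (the role a communication layer such as MPI would play).

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical activations: a global batch of 16 samples with 4 channels.
batch = rng.normal(loc=2.0, scale=3.0, size=(16, 4))


def batch_norm_stats(x):
    """Per-channel mean and (biased) variance over the batch axis."""
    return x.mean(axis=0), x.var(axis=0)


# Statistics one device would compute if it held the whole batch.
global_mean, global_var = batch_norm_stats(batch)

# Naive data parallelism: 4 devices, each seeing only its own 4 samples.
shards = np.split(batch, 4)
shard_means = np.stack([batch_norm_stats(s)[0] for s in shards])
shard_vars = np.stack([batch_norm_stats(s)[1] for s in shards])

# Synchronized statistics: average the per-shard first and second moments
# (valid as written only because all shards have the same size).
sync_mean = shard_means.mean(axis=0)
second_moment = (shard_vars + shard_means ** 2).mean(axis=0)  # E[x^2]
sync_var = second_moment - sync_mean ** 2

print("largest per-shard mean deviation:", np.abs(shard_means - global_mean).max())
print("sync matches global:", np.allclose(sync_mean, global_mean),
      np.allclose(sync_var, global_var))
```

With only a few samples per device, the per-shard statistics are noisy, which is one plausible reason small per-GPU batches hurt batch-normalized networks.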

Interesting to know. Is there any framework that currently supports BN on multiple GPUs?

yijiezh avatar Aug 08 '17 03:08 yijiezh

@EvanzzzZ I only know that OpenMPI can help with it.

wcy940418 avatar Aug 08 '17 03:08 wcy940418

Is there some way to load ResNet weights pre-trained on ImageNet? How long does it take to train a custom model on a single 8 GB GPU?

mrlzla avatar Aug 30 '17 15:08 mrlzla

So in this case, fine-tuning the model is also not possible?

xg1990 avatar Dec 05 '17 23:12 xg1990

Hi guys, I tried to train the Keras code from scratch and got really good performance on Cityscapes data (coarse and fine annotations): ~88% pixel-level validation accuracy (not IoU) and ~85% on testing (not IoU). I use the Cityscapes val set (500 images) for testing and 2970 images for training and validation (80% training, 20% validation). However, I have an issue with the image dimensions. Is it possible to use the same aspect ratio as the Cityscapes data?

nurhadiyatna avatar Jan 23 '18 13:01 nurhadiyatna
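
Since the numbers above are pixel accuracy rather than IoU, a small sketch of how the two metrics differ may help. This is a generic NumPy illustration with hypothetical toy labels, not the evaluation code used in the thread or in the Cityscapes benchmark (for example, it has no handling of ignore labels).

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    idx = y_true.ravel() * num_classes + y_pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the ground truth."""
    return np.trace(cm) / cm.sum()


def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN), skipping absent classes."""
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    present = union > 0
    return (tp[present] / union[present]).mean()


# Toy case: 90 "road" pixels (class 0), 10 "person" pixels (class 1),
# and a model that predicts "road" everywhere.
y_true = np.array([0] * 90 + [1] * 10)
y_pred = np.zeros(100, dtype=int)

cm = confusion_matrix(y_true, y_pred, num_classes=2)
print(pixel_accuracy(cm))  # 0.9
print(mean_iou(cm))        # 0.45
```

Pixel accuracy is dominated by large classes, so a high pixel accuracy does not by itself imply a mean IoU comparable to the paper's.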

@nurhadiyatna can you please post your full results for the validation set? Afaik the original paper only trains on square patches and stitches their square predictions.

jmtatsch avatar Jan 23 '18 14:01 jmtatsch
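
The patch-wise prediction and stitching mentioned above can be sketched as follows. This is a simplified illustration and not the repo's sliced-prediction code. `predict_fn`, `tile_starts` and `predict_sliced` are hypothetical names, the image must be at least one tile in each dimension (no padding is done), and overlapping predictions are simply averaged.

```python
import numpy as np


def tile_starts(length, tile, stride):
    """Start offsets so that tiles cover [0, length); last tile hugs the edge."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)
    return starts


def predict_sliced(image, predict_fn, tile, stride, num_classes):
    """Run predict_fn on square tiles and average the overlapping outputs."""
    h, w = image.shape[:2]
    assert h >= tile and w >= tile, "this sketch does not pad small images"
    probs = np.zeros((h, w, num_classes))
    counts = np.zeros((h, w, 1))
    for y in tile_starts(h, tile, stride):
        for x in tile_starts(w, tile, stride):
            patch = image[y:y + tile, x:x + tile]
            probs[y:y + tile, x:x + tile] += predict_fn(patch)
            counts[y:y + tile, x:x + tile] += 1
    return probs / counts


# Stand-in for a network: returns uniform class probabilities per pixel.
def dummy_predict(patch, num_classes=3):
    return np.full(patch.shape[:2] + (num_classes,), 1.0 / num_classes)


image = np.zeros((20, 30, 3))  # a non-square input
stitched = predict_sliced(image, dummy_predict, tile=8, stride=6, num_classes=3)
print(stitched.shape)  # (20, 30, 3)
```

Because every tile is square, the network itself never sees the image's original aspect ratio; only the stitched output has it.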

Aha, OK, I see. That's why multi-scale and sliced prediction are provided in this repo.

Sure, I will try to provide some images to show the results.

Btw, when I use the layer builder I need to reshape the output of the last layer from (None,713,713,Num_classes) to (None,508369,Num_classes). Could someone explain why we need to do that? I am a newbie in this field. Thank you so much.

nurhadiyatna avatar Jan 23 '18 14:01 nurhadiyatna
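
On the reshape asked about above: 713 * 713 = 508369, so the reshape only flattens the two spatial axes into one "pixel" axis. One common reason for doing this (an assumption here, not confirmed by the repo authors) is that a (batch, pixels, classes) layout makes it straightforward to apply a per-pixel softmax and categorical cross-entropy. The NumPy sketch below uses a small hypothetical 5x5 map to show that the flattening changes neither the per-pixel values nor the loss.

```python
import numpy as np

# The flattened axis in the question is just height * width.
assert 713 * 713 == 508369

# Small hypothetical sizes so the example runs instantly.
n, h, w, c = 2, 5, 5, 3
rng = np.random.default_rng(0)
logits = rng.normal(size=(n, h, w, c))
labels = rng.integers(0, c, size=(n, h, w))


def softmax(x):
    """Softmax over the last (class) axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def cross_entropy(probs, labels):
    """Mean per-pixel categorical cross-entropy."""
    onehot = np.eye(probs.shape[-1])[labels]
    return -(onehot * np.log(probs)).sum(axis=-1).mean()


# (N, H, W, C) -> (N, H*W, C): pixels are laid out row by row.
flat_logits = logits.reshape(n, h * w, c)
flat_labels = labels.reshape(n, h * w)

loss_4d = cross_entropy(softmax(logits), labels)
loss_3d = cross_entropy(softmax(flat_logits), flat_labels)
print(np.isclose(loss_4d, loss_3d))  # True

# Pixel (row 1, col 2) ends up at flat index 1 * w + 2, and the reshape
# can be undone to get the spatial map back for visualization.
restored = flat_logits.reshape(n, h, w, c)
```

For sliced prediction, the flattened output would need to be reshaped back to (height, width, classes) before tiles are stitched.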

@nurhadiyatna any update on the results of a self-trained psp-net?

jmtatsch avatar Feb 01 '18 16:02 jmtatsch

@jmtatsch I guess my training overfit; here are some logs of my training process. I used both coarse and fine annotations:

[Image: overfit]

Accuracy:

[Image: over_fitting_acc]

Btw, I need to split the data while loading the dataset, into about 10 splits of the 2970 images, so this training process is actually 10 runs of 100 epochs. Adam was used with lr 0.001. I only have pixel accuracy; IoU is still in progress. However, this far outperforms the SegNet I used earlier. The final test accuracy on the 500 val images is ~86%.

Here is my latest result with PSPNet using SGD (lr 0.001):

[Image: pred]

Loss:

[Image: sgd_loss1]

Acc:

[Image: sgd_acc1]

I still have a problem doing sliced prediction since I changed the last layer in PSPNet a bit. Can you help me figure out how to solve that, @jmtatsch?

nurhadiyatna avatar Feb 05 '18 16:02 nurhadiyatna

Hi @nurhadiyatna, can you share your training code? Thanks.

shipengai avatar Feb 24 '18 09:02 shipengai

Hi @nurhadiyatna, did you train PSPNet on the ADE20K dataset? Thanks.

shipengai avatar Feb 24 '18 10:02 shipengai

Hi @shipeng-uestc ... No, I didn't. Only on Pascal VOC 2010 and Cityscapes; however, the results are still far below the original paper....

nurhadiyatna avatar Mar 14 '18 12:03 nurhadiyatna

I didn't find the loss function.

horizonheart avatar Sep 27 '18 08:09 horizonheart

Well, I'm confused about how to use "train.py". I checked "/python_utils/preprocessing.py" and don't know which dataset the default training process uses. According to line 86 of "train.py", parser.add_argument('-m', '--model', type=str, default='pspnet50_ade20k', I guess the default is ADE20K, so I added the dataset path to "train.py". After that, I ran "train.py" and got the message "Pooling parameters for input shape (640, 480) are not defined.". I think the reason is line 97 of "train.py", train(args.datadir, args.logdir, (640, 480), args.classes, args.resnet_layers, which means that at line 195 of "layers_builder.py" input_shape == (640, 480), so it runs print("Pooling parameters for input shape ", input_shape, " are not defined.") and exit(1). Hmm... how can this code be trained on common datasets (such as VOC 2012, Cityscapes and ADE20K)? @nurhadiyatna @jmtatsch @wcy940418

zl535320706 avatar Oct 11 '18 08:10 zl535320706
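
The error quoted above suggests that layers_builder.py only defines pyramid pooling parameters for particular input shapes, so the simplest workaround is probably to train at a shape it does define (713x713 appears earlier in this thread). For other shapes, the parameters would have to be derived. The sketch below is one hypothetical way to do that and is not the repo's code: it assumes the backbone downsamples by a factor of 8 and uses the 1, 2, 3, 6 bin levels from the PSPNet paper; `pooling_sizes` is an illustrative name.

```python
import math

LEVELS = (1, 2, 3, 6)  # pyramid bin counts per side, as in the PSPNet paper
OUTPUT_STRIDE = 8      # assumption: the feature map is 1/8 of the input size


def pooling_sizes(input_shape):
    """Pooling kernel (height, width) per pyramid level for an input shape."""
    feat_h = math.ceil(input_shape[0] / OUTPUT_STRIDE)
    feat_w = math.ceil(input_shape[1] / OUTPUT_STRIDE)
    # Floor division: when the feature size is not divisible by the level,
    # a few border rows or columns are left out of the pooled bins.
    return {level: (feat_h // level, feat_w // level) for level in LEVELS}


print(pooling_sizes((713, 713)))
# {1: (90, 90), 2: (45, 45), 3: (30, 30), 6: (15, 15)}
print(pooling_sizes((640, 480)))
# {1: (80, 60), 2: (40, 30), 3: (26, 20), 6: (13, 10)}
```

Whether values derived this way work with the ported weights and the rest of the training code for a non-square shape such as (640, 480) is untested here.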

Hi @nurhadiyatna, can you share your training code for the Cityscapes dataset? The training code in this project is quite confusing.... Thanks.

world4jason avatar Nov 25 '18 22:11 world4jason

@world4jason Indeed, there are many errors in this training code and it is quite confusing. I think anyone who intends to train should modify preprocessing.py and make sure train.py is compatible with both Python 2 and Python 3.

sainttelant avatar Nov 26 '18 03:11 sainttelant

I've already rewritten train.py and some related *.py files, and training now runs. However, batch_size must be set to 1; otherwise the hardware runs out of resources.

sainttelant avatar Nov 26 '18 03:11 sainttelant

@sainttelant can you kindly share your code or project? Thanks.

world4jason avatar Dec 07 '18 16:12 world4jason