
Training Issue

JoeHEZHAO opened this issue on Oct 30, 2017 · 5 comments

Hello Lin,

I have a couple of questions about training the network with the data generated by gen_pnet_data.py.

I noticed that data/mtcnn/imglists/train_12.txt mixes positive and negative images together with their ground truth. I am wondering how you deal with the negative samples, whose bounding-box ground truth is just 0.

For example, suppose the regression output is [0.1, 0.2, 0.3, 0.4] and the negative sample's bbox ground truth is [0]. Should I make it [0, 0, 0, 0]? Or should I train the bbox regression only on positive and part data? However, since we are training classification and bbox regression at the same time, I am guessing we should train on all the data at once, right?

Best,
HZ

JoeHEZHAO avatar Oct 30 '17 01:10 JoeHEZHAO

@JoeHEZHAO For the bbox regression task, the gradients of negative examples are set to 0 in the backward pass (https://github.com/Seanlinx/mtcnn/blob/master/core/negativemining.py#L53), so their ground truth can be any value you like. It is set to [0, 0, 0, 0] in this code: https://github.com/Seanlinx/mtcnn/blob/master/core/imdb.py#L127

Seanlinx avatar Oct 30 '17 08:10 Seanlinx
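A minimal sketch of the masking Seanlinx describes, written as PyTorch-style Python rather than the repo's actual MXNet code; the function name and the label convention (1 = positive, -1 = part, 0 = negative) follow this thread but are otherwise assumptions:

```python
import torch

def masked_bbox_loss(bbox_pred, bbox_target, label):
    """MSE bbox-regression loss that ignores negative samples.

    label: 1 = positive, -1 = part, 0 = negative.
    Negatives are masked out, so whatever ground truth is stored
    for them ([0, 0, 0, 0] in this repo) never produces a gradient.
    """
    mask = (label != 0).float().unsqueeze(1)      # [batch, 1], 1 for pos/part
    diff = (bbox_pred - bbox_target) * mask       # zero out negative rows
    num_valid = mask.sum().clamp(min=1.0)         # guard against all-negative batch
    return diff.pow(2).sum() / (num_valid * 4.0)  # mean over valid boxes
```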

@Seanlinx Thanks so much for replying. Please allow me to rephrase your words, just to make sure I understand correctly.

For classification, we use labels 0 and 1 to find the valid indices and calculate the loss. And for bbox regression, we use labels -1 and 1 to get the valid indices and the loss. Am I correct?

According to the paper, the total loss would be cls_loss + 0.5 * bbx_loss?

JoeHEZHAO avatar Oct 31 '17 23:10 JoeHEZHAO

@JoeHEZHAO Yes, you're right. I didn't follow the loss ratio given in the paper, since my implementation doesn't include the landmark task. I simply assign equal importance to the cls and bbox tasks, so it's cls_loss + bbox_loss. You can alter the value of grad_scale in the 'bbox_pred' layer to set the importance of the bbox task.

Seanlinx avatar Nov 01 '17 02:11 Seanlinx
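In other words, the combined objective looks like the following sketch (a hypothetical helper, not code from the repo); grad_scale in the MXNet 'bbox_pred' layer plays the role of bbox_weight here:

```python
def total_loss(cls_loss, bbox_loss, bbox_weight=1.0):
    """Joint objective. The paper uses bbox_weight = 0.5 (alongside a
    landmark term); this repo weights both tasks equally (1.0)."""
    return cls_loss + bbox_weight * bbox_loss
```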

@Seanlinx Thanks so much for your help. But another problem has just come up. When I train the PNet, the loss for bounding-box regression explodes, up to 100000 or so, unless I use BatchNormalization after the PReLU layers. May I ask how you calculate the loss for bounding-box regression? I am using MSE between target_bbx (size [batch, 4]) and pred_bbx (size [batch, 4]).

JoeHEZHAO avatar Nov 02 '17 17:11 JoeHEZHAO

Sorry for the bother. The previous question has been solved by normalizing the input images with ImageNet's parameters (mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)). I get reasonable loss values now. It seems the ConvNet works better on data in the (0, 1) range.

However, when I train classification and regression together, the losses never go down to the 0.22 and 0.015 that your trained code reaches, though when I train them separately I can get losses as low as yours.

May I ask again: are you just adding the two losses together and then doing backpropagation? Is there anything else that should be done when calculating the loss?

JoeHEZHAO avatar Nov 03 '17 00:11 JoeHEZHAO
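For reference, the normalization JoeHEZHAO describes corresponds to something like this sketch (assumed helper name, NumPy-based; only the mean/std values come from the comment above):

```python
import numpy as np

# ImageNet statistics quoted in the comment above.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize(img_uint8):
    """Scale an HxWx3 RGB uint8 image to [0, 1], then standardize per channel."""
    img = img_uint8.astype(np.float32) / 255.0
    return (img - IMAGENET_MEAN) / IMAGENET_STD
```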