pysot
Shapes of pred & label differ when training siamrpn_alex_dwxcorr/config.yaml
Hello everyone, I trained siamrpn_r50_l234_dwxcorr/config.yaml on my PC successfully. But when I train with siamrpn_alex_dwxcorr/config.yaml, it always raises an error while computing the loss function.
After tracing, I found that in the following function the shapes of pred and label differ: pred is 17x17 and label is 25x25, and 25x25 is not the right size for the AlexNet backbone. Should I change any other configuration before training with "--cfg=(path)/siamrpn_alex_dwxcorr/config.yaml"? Thank you.
def select_cross_entropy_loss(pred, label):
    pred = pred.view(-1, 2)
    label = label.view(-1)
    pos = label.data.eq(1).nonzero().squeeze().cuda()
    neg = label.data.eq(0).nonzero().squeeze().cuda()
    loss_pos = get_cls_loss(pred, label, pos)
    loss_neg = get_cls_loss(pred, label, neg)
    return loss_pos * 0.5 + loss_neg * 0.5
You can use siamrpn_alex_dwxcorr_16gpu/config.yaml instead.
The error occurred because cfg.TRAIN.OUTPUT_SIZE remained 25, the default defined in pysot/core/config.py, whereas it should be 17 for AlexNet. siamrpn_alex_dwxcorr_16gpu/config.yaml overrides the training hyperparameters that the AlexNet-based network needs, but siamrpn_alex_dwxcorr/config.yaml does not.
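For anyone hitting the same error, here is a rough sketch of the size arithmetic behind the 17 vs. 25 mismatch. The formula, the 255/127 input sizes, the stride of 8, and the base-size values (0 for AlexNet, 8 for ResNet-50) are assumptions about how the score-map size is derived, so check them against your own config. `score_map_size` and `check_label_shape` are hypothetical helpers, not pysot functions.

```python
def score_map_size(search_size, exemplar_size, stride, base_size):
    """Side length of the score map under the assumed formula."""
    return (search_size - exemplar_size) // stride + 1 + base_size


def check_label_shape(pred_size, label_size):
    """Fail early, with a readable message, instead of inside the loss."""
    if pred_size != label_size:
        raise ValueError(
            f"pred is {pred_size}x{pred_size} but label is "
            f"{label_size}x{label_size}; check cfg.TRAIN.OUTPUT_SIZE"
        )


# Assumed settings: 255 search crop, 127 exemplar crop, stride 8.
alex_size = score_map_size(255, 127, 8, base_size=0)  # AlexNet (assumed)
r50_size = score_map_size(255, 127, 8, base_size=8)   # ResNet-50 (assumed)

print(alex_size, r50_size)

try:
    # Labels built with the default size against an AlexNet prediction.
    check_label_shape(pred_size=alex_size, label_size=r50_size)
except ValueError as err:
    print(err)
```

Under these assumptions the two backbones give 17 and 25, which matches the shapes reported above. Running a check like this right after the config is loaded surfaces the problem before the first batch reaches the loss.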
@kumatheworld Thanks for your help. I think the problem really is caused by the wrong setting of cfg.TRAIN.OUTPUT_SIZE.
I can now train the network if I manually set cfg.TRAIN.OUTPUT_SIZE=17, but one problem remains.
I made sure cfg.TRAIN.OUTPUT_SIZE was 17 before entering the training for loop.
However, it keeps reverting to 25 inside __getitem__(...). I can set it back to 17 there, but I don't know why this happens.
OK, I see: dataset.py does not merge from config.yaml at all. I think this is a small bug that can be fixed.
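One possible workaround, sketched below, is to stop reading the global config inside __getitem__ and instead hand the value to the dataset when it is constructed. `LabelDataset` and `DEFAULT_OUTPUT_SIZE` are hypothetical stand-ins for illustration, not pysot's actual dataset class or config object. The idea is that an instance attribute travels with the dataset object, so code that later sees a freshly imported, unmerged default is not affected.

```python
DEFAULT_OUTPUT_SIZE = 25  # stand-in for the unmerged default value


class LabelDataset:
    """Toy dataset that fixes its label size at construction time."""

    def __init__(self, output_size, length=4):
        # Capture the merged value once, in the process that loaded the YAML.
        self.output_size = output_size
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        size = self.output_size  # no lookup of a module-level default here
        # Placeholder label: a size x size grid of zeros.
        return [[0] * size for _ in range(size)]


# The training script would pass the value taken from the merged config.
dataset = LabelDataset(output_size=17)
label = dataset[0]
print(len(label), len(label[0]))
```

With this pattern the label grid follows the value supplied by the training script, whatever the module-level default happens to be.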
You're more than welcome to submit a merge request