yolo2-pytorch

mAP

Open GOATmessi7 opened this issue 7 years ago • 32 comments

Have you ever evaluated the converted trained model on VOC2007? I tried your code and got 71.9 mAP, while the original is 76.8. Then I found a tiny error in the test code; after fixing it the result went up to 72.8 mAP, which is still not enough...

GOATmessi7 avatar Mar 04 '17 12:03 GOATmessi7

Yes, I got the same result. You can make a pull request for me to fix the bug. I have no idea what causes the low mAP of my implementation. Did you try the darknet implementation by the author?

longcw avatar Mar 04 '17 12:03 longcw

Not yet, but I found an issue in darkflow; it seems the port to TensorFlow also causes some difference. https://github.com/thtrieu/darkflow/issues/25 I will make a pull request if I figure out the training part. Maybe that could solve the problem...

GOATmessi7 avatar Mar 04 '17 12:03 GOATmessi7

I implemented the loss function following darknet, and the training process works now. I trained it on the VOC2007 trainval set and got ~71.86 mAP~ ~50 mAP on the test set. Maybe you can find some other causes of the low mAP with the help of the darknet source code.

longcw avatar Mar 07 '17 01:03 longcw

@longcw Thank you for sharing the code. I tested the converted darknet model, which got ~72 mAP. Then I trained on the VOC07 trainval set for 160 epochs (using your GitHub code as-is), which only got ~50 mAP. Did you successfully train the yolo2 detector?

terrychenism avatar Mar 07 '17 22:03 terrychenism

Thank you for your comment. I tested the trained model and got the same result, ~50 mAP. There are still some bugs in training. I am sorry about this.

longcw avatar Mar 08 '17 01:03 longcw

For the test phase, two parameters are inconsistent with the original darknet:

  • The thresh parameter for bbox filtering is 0.001 in darknet, while it is 0.01 in test.py;
  • The iou_thresh for nms is 0.5 in darknet, while it is 0.3 in this project.

For the training phase, the thresh in cfgs/config.py should be 0.24 instead of 0.3.

As @ruinmessi reported, before correcting those parameters the mAP on VOC2007-test is 71.9. Correcting the first parameter improves it slightly to 72.2, and correcting the iou_thresh boosts it further to 73.6. The TensorFlow version of YOLO (darkflow) seems to suffer from the same problem, and an issue in that project pointed out some possible reasons. Maybe those reasons also apply to this project?
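
A minimal sketch of the test-phase mismatch described in this comment, written as a plain comparison of two settings. The dictionary and function names are hypothetical and are not taken from test.py or darknet; only the four numeric values come from the comment.

```python
# Hypothetical containers; only the numeric values are quoted from the thread.
DARKNET_TEST_PARAMS = {"thresh": 0.001, "iou_thresh": 0.5}
PROJECT_TEST_PARAMS = {"thresh": 0.01, "iou_thresh": 0.3}


def diff_params(reference, current):
    """Return {name: (reference_value, current_value)} for every mismatch."""
    return {
        name: (value, current.get(name))
        for name, value in reference.items()
        if current.get(name) != value
    }


mismatches = diff_params(DARKNET_TEST_PARAMS, PROJECT_TEST_PARAMS)
for name, (ref, cur) in sorted(mismatches.items()):
    print(f"{name}: darknet={ref}, this project={cur}")
```

A check like this could be run before evaluation, so that a silent difference in thresholds does not get mistaken for a difference between models.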

crazylyf avatar Mar 14 '17 12:03 crazylyf

@ruinmessi What error in the test code did you fix?

crazylyf avatar Mar 14 '17 12:03 crazylyf

@longcw @crazylyf Sorry for being away so long. I boosted the mAP to 74.3 by changing the NMS order like this, whereas this project does the NMS in a function called postprocess. This is with the exact parameters you mentioned.

GOATmessi7 avatar Mar 15 '17 13:03 GOATmessi7

Why is your mAP 0.7 higher if we are using the same parameters? Am I missing something?

crazylyf avatar Mar 15 '17 13:03 crazylyf

NMS should be applied before thresholding.

GOATmessi7 avatar Mar 15 '17 13:03 GOATmessi7

@ruinmessi Thank you for pointing out this problem.

longcw avatar Mar 15 '17 13:03 longcw

@longcw I am curious about how to convert the original weights to an h5 file. Could you please show me some details or scripts?

GOATmessi7 avatar Mar 16 '17 10:03 GOATmessi7

@ruinmessi I use darkflow to load original weights from the binary weights file.

longcw avatar Mar 16 '17 15:03 longcw

Is there any update on the training issue?

rdfong avatar Apr 14 '17 09:04 rdfong

@ruinmessi Does the order of NMS and thresholding affect the results? I don't think so. Can anyone prove me wrong?
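
One way to examine this question is a toy experiment. The sketch below assumes plain single-class greedy NMS; it is not this project's postprocess function or darknet's code, and all function names and boxes are made up for illustration. In this simplified setting a box can only be suppressed by a higher-scoring box that was kept, so boxes below the score threshold never influence boxes above it and both orders return the same detections. A difference like the one reported in this thread would have to come from details that this sketch does not model.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(dets, iou_thresh):
    """dets: list of (box, score). Visit boxes by descending score and keep
    a box only if it does not overlap an already kept box too much."""
    kept = []
    for box, score in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= iou_thresh for kept_box, _ in kept):
            kept.append((box, score))
    return kept


def threshold(dets, thresh):
    """Drop detections whose score is below thresh."""
    return [d for d in dets if d[1] >= thresh]


# Made-up detections for a single class.
dets = [
    ((10, 10, 50, 50), 0.90),
    ((12, 12, 52, 52), 0.60),       # overlaps the first box heavily
    ((100, 100, 140, 140), 0.30),
    ((102, 98, 141, 139), 0.005),   # below thresh, overlaps the third box
]

nms_then_thresh = threshold(greedy_nms(dets, iou_thresh=0.5), thresh=0.01)
thresh_then_nms = greedy_nms(threshold(dets, thresh=0.01), iou_thresh=0.5)
print(nms_then_thresh == thresh_then_nms)
```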

jxgu1016 avatar Apr 27 '17 11:04 jxgu1016

Perhaps the weights of the convolutional layers need to be held fixed while training on the VOC datasets?

rdfong avatar Apr 28 '17 09:04 rdfong

In darknet19_448.cfg from the darknet project, batch size is 128, not 16 as it is in the config files here. Unfortunately I do not have the resources to test with a full batch size of 128. With 16 though I can confirm that I only get ~50 mAP. Can someone else try to confirm whether or not changing the batch size makes a difference? It's the only parameter I can find that differs between the two projects.

rdfong avatar Apr 28 '17 14:04 rdfong

I slightly changed this code (following the original YOLO training procedure), trained for 160 epochs on VOC07+12, and tested on VOC07-test. Evaluated mAP at 416 x 416 resolution:

  • 0.6334, batch size 16 (trained by me)
  • 0.6446, batch size 32 (trained by me)
  • 0.7221, batch size 64 (tested directly using the weights provided by @longcw, yolo-voc.weights.h5)
  • 0.768, batch size 64 (claimed by the paper, not trained by me)

Revising this code seems necessary if you want to train with such a large batch size (64). It needs to work on multiple GPUs (splitting a large batch into smaller ones that fit into a single GPU's memory).
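
For readers without several GPUs, a related idea is gradient accumulation: process the large batch as equal-sized micro-batches on one device and average their gradients before a single weight update. This is a different technique from the multi-GPU split suggested here, and the sketch below is only a pure-Python illustration on a one-parameter linear model; the function names and data are hypothetical and nothing in it comes from this repository. It shows that the averaged micro-batch gradients match the full-batch gradient for a loss that is a mean over samples. For networks with batch normalization the match is not exact, because the normalization statistics are computed per micro-batch.

```python
def grad_mse(w, batch):
    """Gradient w.r.t. w of the mean squared error of y ~ w * x
    over a batch of (x, y) pairs."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, batch, micro_batch_size):
    """Split batch into equal micro-batches and average their gradients."""
    if len(batch) % micro_batch_size != 0:
        raise ValueError("batch size must be a multiple of micro_batch_size")
    chunks = [
        batch[i:i + micro_batch_size]
        for i in range(0, len(batch), micro_batch_size)
    ]
    return sum(grad_mse(w, chunk) for chunk in chunks) / len(chunks)


# Made-up data: 64 samples, processed as 4 micro-batches of 16.
data = [(float(x), 3.0 * x + 1.0) for x in range(64)]
w = 0.5

full = grad_mse(w, data)
accum = accumulated_grad(w, data, micro_batch_size=16)
print(abs(full - accum) <= 1e-9 * abs(full))
```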

I think something is still mismatched, so the mAP drops considerably.

cory8249 avatar May 01 '17 05:05 cory8249

I have implemented YOLOv2 in TensorFlow, but I can only achieve an mAP of about 0.60 on VOC07-test (trained on VOC07+12 train+val), with all the tricks in Table 2 of the paper implemented except "hi-res detector". @cory8249 Could you kindly share your code which achieves 0.768 mAP? Thanks!!

JesseYang avatar May 06 '17 13:05 JesseYang

@JesseYang Sorry for the misunderstanding; the 0.768 mAP model was not trained by me. I just mentioned it as a reference.

cory8249 avatar May 06 '17 13:05 cory8249

@cory8249 I see. Thanks!

JesseYang avatar May 06 '17 14:05 JesseYang

I fixed the IoU bug and trained on VOC0712 trainval. I get mAP = 0.6825 (still increasing slowly). https://github.com/cory8249/yolo2-pytorch/blob/master/darknet.py#L120

cory8249 avatar May 09 '17 14:05 cory8249

@cory8249 Had you fixed another issue when you got the 0.6825 mAP?

JesseYang avatar May 10 '17 04:05 JesseYang

@JesseYang I think I've fixed these exp() sig() bugs in my experiment.

cory8249 avatar May 10 '17 06:05 cory8249

I also found something interesting:

  • ver.A = pytorch anaconda prebuilt version (cp36)
  • ver.B = pytorch built from source code using native python (python35)

In the training phase ver.A is 2x slower than ver.B (1 sec/batch vs. 0.5 sec/batch). In the test phase ver.A is 1.5x slower than ver.B (16 ms/img vs. 11 ms/img).

Does anyone else have the same problem?

cory8249 avatar May 10 '17 15:05 cory8249

I've trained a model with mAP = 0.71 by fixing the bug in #23

cory8249 avatar May 16 '17 03:05 cory8249

Has anyone trained YOLOv1 on Pascal VOC (2007+2012 trainval) and surpassed 60% mAP on the 2007 test set?

gauss-clb avatar Jan 01 '18 11:01 gauss-clb

After modifying the code as mentioned here, my mAP goes up to 72.1% with 416*416 input.

xuzijian avatar Apr 08 '18 02:04 xuzijian

@xuzijian what mAP do you get with VOC(2007 trainval) after the changes?

wahrheit-git avatar Apr 13 '18 03:04 wahrheit-git

@kk1153 I haven't trained models with only the VOC07 dataset.

xuzijian avatar Apr 14 '18 02:04 xuzijian