Can you please also include the results of speed vs mAP accuracy and compare it with original YOLOv4 implementation? The current results only include mAP accuracy.
Hello, do you mean comparing the speed of Darknet and PyTorch?
Great work @WongKinYiu . Glad to see you make this repo (I had been following your discussion with Alexey when you were doing all your comparisons between your and his V4 work and the Ultralytics "v5" release).
I suspect the OP is asking at the very least about fps info when these networks are run in the native Pytorch implementation in this repo on whatever GPU you are using. I am also curious to see those numbers as a part of your table in the Readme.
Cheers
Hello, do you mean comparing the speed of Darknet and PyTorch?
@WongKinYiu Yes. I was just curious about the speed and mAP comparison of original darknet implementation by AlexeyAB vs your pytorch implementation.
With the Leaky activation function, PyTorch runs about 5 FPS slower than Darknet at a 608x608 input size. With the Mish activation function, PyTorch is about 30% slower than Darknet.
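One likely reason for the Mish gap: Mish composes several elementwise ops (softplus, tanh, multiply), each launching its own kernel unless fused, while leaky ReLU is a single comparison. A minimal Python sketch of the two formulas (function names here are illustrative, not from the repo):

```python
import math

def softplus(x: float) -> float:
    # numerically stable softplus: log(1 + exp(x))
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def mish(x: float) -> float:
    # Mish(x) = x * tanh(softplus(x)) -- three chained elementwise ops
    return x * math.tanh(softplus(x))

def leaky_relu(x: float, slope: float = 0.1) -> float:
    # Darknet-style leaky ReLU: a single comparison and multiply
    return x if x > 0 else slope * x
```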
@WongKinYiu - Is that difference in speed primarily because those activation functions are in Python and not custom C / CUDA implementations compared to darknet implementations?
@WongKinYiu why not use this - https://github.com/thomasbrandon/mish-cuda for PyTorch
@digantamisra98
thanks, i will take a look at it.
@digantamisra98
thanks, now it is almost as fast as the leaky model (+6~10% inference time).
before:

```
Speed: 6.7/1.0/7.7 ms inference/NMS/total per 736x736 image at batch-size 8
```

after:

```
Speed: 5.3/0.9/6.1 ms inference/NMS/total per 736x736 image at batch-size 8
```
AP drops a little bit; maybe I will re-train the model with this Mish implementation.
Great, what precision are you using (fp32/fp16)? Also, what's the drop in AP? Usually the faster implementation mirrors the learning performance of the original non-optimized version.
@digantamisra98
I use fp16, and it drops ~0.01% AP.
I don't think that's a significant drop. So you basically swapped the old Mish for this new version at inference time only, using the model already trained with the old Mish?
yes, just use the new Mish to replace the old Mish.

```python
from mish_cuda import MishCuda as Mish
```
@WongKinYiu - Do you have an idea or a guess about where the performance is being lost between the Darknet and the PyTorch implementations? In the past I have usually seen similar models in PyTorch run a bit faster than their Darknet counterparts. I haven't kept up with darknet repo development, but I was curious where this performance difference may be coming from.
cudnn api.
@WongKinYiu Thanks for the great work. Just tested Alexey original darknet implementation, and it seems to have a much higher mAP. Any idea why?
original yolov4 is trained with the coco2014 setting, and val5k of coco2017 has some overlap with the coco2014 training set. if you want to compare mAP, please use the test-dev set.
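The overlap is easy to verify yourself: COCO images carry stable integer IDs across the 2014 and 2017 splits, so intersecting the ID sets of train2014 and val2017 shows the leakage directly. A minimal sketch (the helper name and toy IDs below are illustrative; for a real check, load the `images` id field from the two COCO annotation JSONs):

```python
def coco_overlap(train_ids, val_ids):
    """Return the validation image IDs that also appear in the training split."""
    return set(train_ids) & set(val_ids)

# toy illustration of the check with made-up IDs
shared = coco_overlap([100, 101, 102], [101, 102, 103])
print(f"{len(shared)} overlapping images")
```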
@digantamisra98 hello, i updated the results of the mish models; you can find them in the readme.
Thank you, checking it out now!