Can you please also include the results of speed vs mAP accuracy and compare it with original YOLOv4 implementation? The current results only include mAP accuracy.
Hello, do you mean comparing the speed of Darknet and PyTorch?
Great work @WongKinYiu . Glad to see you make this repo (I had been following your discussion with Alexey when you were doing all your comparisons between your and his V4 work and the Ultralytics "v5" release).
I suspect the OP is asking at the very least about fps info when these networks are run in the native Pytorch implementation in this repo on whatever GPU you are using. I am also curious to see those numbers as a part of your table in the Readme.
Cheers
Hello, do you mean comparing the speed of Darknet and PyTorch?
@WongKinYiu Yes. I was just curious about the speed and mAP comparison of original darknet implementation by AlexeyAB vs your pytorch implementation.
With the Leaky activation function, PyTorch runs about 5 FPS slower than Darknet at a 608x608 input size. With the Mish activation function, PyTorch is about 30% slower than Darknet.
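One likely reason for the Mish gap: Mish composes several elementwise ops (softplus, tanh, multiply), each launching its own kernel unless fused, while leaky ReLU is a single comparison. A minimal Python sketch of the two formulas (function names here are illustrative, not from the repo):

```python
import math

def softplus(x: float) -> float:
    # numerically stable softplus: log(1 + exp(x))
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def mish(x: float) -> float:
    # Mish(x) = x * tanh(softplus(x)) -- three chained elementwise ops
    return x * math.tanh(softplus(x))

def leaky_relu(x: float, slope: float = 0.1) -> float:
    # Darknet-style leaky ReLU: a single comparison and multiply
    return x if x > 0 else slope * x
```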
@WongKinYiu - Is that difference in speed primarily because those activation functions are in Python and not custom C / CUDA implementations compared to darknet implementations?
@WongKinYiu why not use this - https://github.com/thomasbrandon/mish-cuda for PyTorch
@digantamisra98
thanks, i will take a look at it.
@digantamisra98
thanks, now it is almost as fast as the leaky model (+6~10% inference time).
before:

```
Speed: 6.7/1.0/7.7 ms inference/NMS/total per 736x736 image at batch-size 8
```

after:

```
Speed: 5.3/0.9/6.1 ms inference/NMS/total per 736x736 image at batch-size 8
```
AP drops a little bit; maybe I will re-train the model with this Mish implementation.
Great, what precision are you using (fp32/fp16)? Also, what's the drop in AP? Usually the faster implementation mirrors the learning performance of the original non-optimized version.
@digantamisra98
I use fp16, and it drops ~0.01% AP.
I don't think that's a significant drop. So you basically swapped the old Mish for this new version at inference time only, using the model already trained with the old Mish?
yes, just use the new Mish to replace the old Mish.

```python
from mish_cuda import MishCuda as Mish
```
@WongKinYiu - Do you have an idea or a guess about where the performance is being lost between the Darknet and the PyTorch implementations? In the past I have usually seen similar models in PyTorch run a bit faster than their Darknet counterparts. I haven't kept up with darknet repo development, but I was curious where this performance difference may be coming from.
cudnn api.
@WongKinYiu Thanks for the great work. Just tested Alexey original darknet implementation, and it seems to have a much higher mAP. Any idea why?
original yolov4 is trained with the coco2014 setting, and val5k of coco2017 has some overlap with the coco2014 training set. if you want to compare mAP, please use the test-dev set.
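The overlap is easy to verify yourself: COCO images carry stable integer IDs across the 2014 and 2017 splits, so intersecting the ID sets of train2014 and val2017 shows the leakage directly. A minimal sketch (the helper name and toy IDs below are illustrative; for a real check, load the `images` id field from the two COCO annotation JSONs):

```python
def coco_overlap(train_ids, val_ids):
    """Return the validation image IDs that also appear in the training split."""
    return set(train_ids) & set(val_ids)

# toy illustration of the check with made-up IDs
shared = coco_overlap([100, 101, 102], [101, 102, 103])
print(f"{len(shared)} overlapping images")
```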
@digantamisra98 hello, i updated the results of the mish models; you can find them in the readme.
Thank you, checking it out now!