Object detection is very slow

Open isabek opened this issue 7 years ago • 92 comments

I have installed CUDA & OpenCV for object detection as described in the docs.

Then I tried to detect objects in a video file and got roughly 2.8 FPS.

./darknet detector demo cfg/coco.data cfg/yolo.cfg yolo.weights video-file.mp4

Then I tried to use cuDNN, but the result was 2.0 FPS.

What should I do to get higher FPS?

Environment:

  • Ubuntu 16.04
  • CUDA 8.0.61
  • OpenCV 3.2
  • GPU: GeForce 840M (2GB)
  • Driver: NVIDIA 375.66

isabek avatar Jul 13 '17 22:07 isabek

@Isabek You should get about 7 FPS on a GeForce 840M.

What is the resolution in your video file?

AlexeyAB avatar Jul 13 '17 22:07 AlexeyAB

@AlexeyAB 1920x1080. Is that too big? What resolution is suitable for darknet?

isabek avatar Jul 13 '17 22:07 isabek

Try decreasing the width=416 and height=416 settings in the yolo.cfg file: https://github.com/pjreddie/darknet/blob/master/cfg/yolo.cfg#L8
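For example, the cfg change can be scripted. This is a minimal sketch; the cfg text below is a stand-in for the `[net]` section, not the real yolo.cfg, and the function name is illustrative:

```python
import re

def set_network_resolution(cfg_text, width, height):
    """Rewrite the width= and height= lines of a darknet .cfg [net] section."""
    cfg_text = re.sub(r"(?m)^width=\d+", f"width={width}", cfg_text)
    cfg_text = re.sub(r"(?m)^height=\d+", f"height={height}", cfg_text)
    return cfg_text

# Stand-in for the top of yolo.cfg
cfg = "[net]\nbatch=1\nwidth=608\nheight=608\nchannels=3\n"
print(set_network_resolution(cfg, 416, 416))
```

Lower network resolution means fewer convolutions per frame, which is why FPS goes up (at the cost of accuracy on small objects).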

A video resolution of 1920x1080 is normal for Yolo, but you can also try a lower-resolution video file.

AlexeyAB avatar Jul 13 '17 22:07 AlexeyAB

I have decreased width and height in the config file. Now FPS is ~5.4, which is good, but still not enough. Btw, I tried a lower-resolution video, but the result was the same.

isabek avatar Jul 13 '17 22:07 isabek

@AlexeyAB thank you.

isabek avatar Jul 13 '17 22:07 isabek

Hi @Isabek, I'm running into the same problem you described and wonder if you've solved it?

DennisWangCW avatar Jul 17 '17 06:07 DennisWangCW

Hi @DennisWangCW, if you want to reach the very high FPS shown in the darknet documentation, you need this kind of computer.

Btw, you can train your own model, since the stock YOLO weights try to detect a lot of object classes.

P.S. I couldn't solve it. As @AlexeyAB said, I can reach only 7.5 FPS with my GPU which is not enough for me.

isabek avatar Jul 17 '17 10:07 isabek

@Isabek Hi,

You can use Tiny-Yolo instead of Yolo, so you can get about ~15 FPS on GeForce 840M: ./darknet detector demo cfg/coco.data cfg/tiny-yolo.cfg tiny-yolo.weights video-file.mp4

  • tiny-yolo.cfg: https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/tiny-yolo.cfg
  • tiny-yolo.weights: https://pjreddie.com/media/files/tiny-yolo.weights

Also, you can decrease width and height to 288 in tiny-yolo.cfg, which should get you about ~30 FPS on a GeForce 840M.

But each such step worsens the detection accuracy.

AlexeyAB avatar Jul 17 '17 10:07 AlexeyAB

Thank you @AlexeyAB.

I decreased width and height to 288 in tiny-yolo.cfg and changed video file resolution to 960x540. Now FPS is ~25 which is cool. FYI @DennisWangCW

isabek avatar Jul 18 '17 10:07 isabek

@AlexeyAB how can I train my own model? Is it possible? I have my own image dataset.

isabek avatar Jul 18 '17 10:07 isabek

@Isabek Yes: https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects

AlexeyAB avatar Jul 18 '17 10:07 AlexeyAB

Hi @AlexeyAB. I am a bit confused about absolute_x and absolute_height. Could you explain them? Thanks!

isabek avatar Jul 18 '17 14:07 isabek

@Isabek Say you have a 1920x1080 image, and object-1 with center (100,200), width=50 and height=20. For this object: absolute_x = 100, absolute_y = 200, absolute_width = 50, absolute_height = 20.

Following <x> = <absolute_x> / <image_width> and <height> = <absolute_height> / <image_height>, you should write into the txt file: 1 0.052 0.185 0.026 0.0185
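The arithmetic above can be sketched as a small helper (the function name is illustrative, not part of darknet):

```python
def to_yolo_format(class_id, abs_x, abs_y, abs_w, abs_h, img_w, img_h):
    """Convert an absolute box center/size into YOLO's normalized <x> <y> <w> <h>."""
    return (class_id,
            abs_x / img_w, abs_y / img_h,
            abs_w / img_w, abs_h / img_h)

# The example above: a 1920x1080 image, box centered at (100, 200), 50x20 px
cid, x, y, w, h = to_yolo_format(1, 100, 200, 50, 20, 1920, 1080)
print(f"{cid} {x:.3f} {y:.3f} {w:.3f} {h:.4f}")  # -> 1 0.052 0.185 0.026 0.0185
```

Note that x and w are normalized by the image width, while y and h are normalized by the image height.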

Or just use this tool: https://github.com/AlexeyAB/Yolo_mark

AlexeyAB avatar Jul 18 '17 15:07 AlexeyAB

@AlexeyAB I have 6 classes. How many images should I choose for each class? I selected 100 images per class, and the weights file after 500 iterations is 256 MB.

isabek avatar Jul 19 '17 13:07 isabek

@Isabek I still haven't solved the problem, but thank you anyway.

DennisWangCW avatar Jul 25 '17 09:07 DennisWangCW

@Isabek Usually 500 - 2000 images per class (object) is enough. And you should train for (2000 x number_of_classes) iterations.
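As a quick sanity check, this rule of thumb for the 6-class case above works out as follows (the helper is mine, not part of darknet):

```python
def training_plan(num_classes, images_per_class=1000):
    """Thread's rule of thumb: 500-2000 images per class,
    and about 2000 * num_classes training iterations."""
    return {"images": num_classes * images_per_class,
            "iterations": 2000 * num_classes}

print(training_plan(6))  # -> {'images': 6000, 'iterations': 12000}
```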

AlexeyAB avatar Jul 25 '17 14:07 AlexeyAB

@AlexeyAB thank you. I started training 2 days ago, and it is still training my model :)

isabek avatar Jul 26 '17 11:07 isabek

I played around with the yolo.cfg and tiny-yolo.cfg (width=416 and height=416) config files, but performance lags at around 4 FPS when reading from a video file and writing output to another video file (AVI). What kind of performance should developers expect on the Nvidia TX1 platform?

Also, when I run the web cam demo, I get about 10-12 FPS with tiny-yolo and a Sony Playstation Eye. Do I need to set the camera resolution somewhere?

nvidia@tegra-ubuntu:~/darknet$ ./darknet detector demo cfg/voc.data cfg/tiny-yolo-voc.cfg weights/tiny-yolo-voc.weights

FPS:11.2

Thx.

kaisark avatar Oct 15 '17 22:10 kaisark

I'm seeing about the same performance on the TX2. With tiny-yolo there is a lot of headroom on the TX2 to spawn more processes; I'm able to get ~30 FPS with 4 simultaneous processes.

TheMikeyR avatar Oct 16 '17 09:10 TheMikeyR

@TheMikeyR Did you say 30 FPS on the TX2? Is that the capture rate or the processing rate? What does your config/setup (hw/sw) look like for tiny-yolo? Are you using OpenCV 3? My understanding is that OpenCV is not very good at video I/O (ffmpeg/gstreamer). What type of camera are you using?

kaisark avatar Oct 18 '17 20:10 kaisark

@kaisark I'm processing offline video. I run sudo jetson_clocks.sh from the home directory (it should be installed with JetPack) and then nvpmodel -m 2, which turns on all cores (from 4 to 6) in MAXP mode. I didn't modify the original tiny-yolo much; I have one class to predict, so of course I modified the filters and classes accordingly.
I compiled with CUDNN=1 GPU=1 OPENCV=1, but then removed the "viewing" part of the demo function, so it doesn't display the results and only prints to the terminal.
I uncommented this line https://github.com/pjreddie/darknet/blob/c7252703420159a9f3a1ec416b1b4326c4c95402/src/demo.c#L194 to prevent OpenCV from displaying the video, which speeds it up (it still shows the detected objects in the terminal).
Lastly, I open 4 terminals and run the same command in all of them: ./darknet detector demo data/rgb.data cfg/tiny-yolo.cfg ~/data/create/detection_annotation/yolo/tiny/tiny-yolo_50000.weights ~/data/create/videos/summarized_right_10min.mp4 -i 0
I believe the reported FPS covers the entire processing pipeline, since it is computed in one place in demo.c and is not updated until execution reaches that place again.

The camera used is the ZED camera, taking only the right view of the RGB stream.

TheMikeyR avatar Oct 19 '17 06:10 TheMikeyR

@Isabek Hi, did you train your own model with fewer object classes? Does it improve FPS?

xhuvom avatar Oct 20 '17 13:10 xhuvom

@xhuvom I have trained my own model on my collected dataset, but the result is the same: 3.7 FPS. You can watch the result here: https://www.youtube.com/watch?v=QopUtQobWJ0

isabek avatar Oct 23 '17 10:10 isabek

@AlexeyAB I would like to buy a new video card, and I am a little bit confused. What is the difference between MSI, Asus, EVGA and Zotac?

I am planning to buy GTX 1070. How many frames per second can I reach with GTX 1070 on YOLO?

isabek avatar Nov 03 '17 12:11 isabek

@Isabek Primarily the cooler; some of the cards are also factory overclocked (run faster). Here is a list http://thepcenthusiast.com/geforce-gtx-1070-compared-asus-evga-zotac-msi-gigabyte/ where you can filter and compare by clock speed etc. In the end it doesn't matter much; you can also overclock the card yourself and just go with the cheaper one. It's a silicon lottery: sometimes you get a chip that can overclock a lot, and other times you can't get anything over stock speeds.

Can't help with FPS, depends on many things.

TheMikeyR avatar Nov 03 '17 13:11 TheMikeyR

@Isabek You can achieve about ~0.01 FPS per 1 GFlops-SP using the yolo-voc.cfg network at 416x416 on my fork.

Look at Single precision for your GPU: https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_10_series

  • so on GeForce GTX 1070 - 6462 GFlops-SP = ~64 FPS
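The estimate can be written out explicitly. The 0.01 FPS-per-GFLOPS figure is the rough number given above for yolo-voc at 416x416, and the helper name is illustrative:

```python
FPS_PER_GFLOPS_SP = 0.01  # rough figure from this thread (yolo-voc.cfg, 416x416)

def estimated_fps(gflops_sp):
    """Back-of-the-envelope FPS estimate from a GPU's single-precision GFLOPS."""
    return gflops_sp * FPS_PER_GFLOPS_SP

# GeForce GTX 1070: 6462 GFLOPS-SP
print(estimated_fps(6462))  # roughly 64 FPS
```

This is only a scaling heuristic; actual throughput also depends on memory bandwidth, cuDNN version, and the rest of the pipeline.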

As TheMikeyR correctly said, GPUs from different vendors differ mainly in factory overclocking and the cooling system, and they can also differ in reliability.

AlexeyAB avatar Nov 03 '17 13:11 AlexeyAB

I am a bit confused about precision. How can I calculate the model's precision? I want to compute it inside the function that calculates recall and IoU. I found an answer to my question, but I am not sure about it.

isabek avatar Nov 21 '17 13:11 isabek

@AlexeyAB I need a recommendation for the kind/brand/model of camera to use for training, detection and recognition of multiple faces at a time from a live stream (e.g. a check-in counter). Please help.

jTariq avatar Dec 05 '17 14:12 jTariq

@TheMikeyR Hey, I am also working on the Jetson TX2, and I get the following FPS with sudo nvpmodel -m 0 and sudo ./jetson_clocks.sh:

  • Tiny-Yolo: 17.5 FPS
  • YoloV2: 2.7 FPS
  • Google's Object Detection API with SSD_MobileNet: 4 FPS

What do you get, and how could I speed this up?

I documented my problem a little bit more in detail if you have a look here: https://devtalk.nvidia.com/default/topic/1027819/jetson-tx2/object-detection-performance-jetson-tx2-slower-than-expected/

Would be nice hearing from you!

gustavz avatar Dec 20 '17 09:12 gustavz

@GustavZ I see the same performance on the TX2 with sudo nvpmodel -m 0 and sudo ./jetson_clocks.sh:

  • Tiny-Yolo: 17.5 FPS
  • YoloV2: 2.7 FPS

So I'll move on to TensorRT with JetPack 3.2 beta.

OseongKwon avatar Dec 28 '17 04:12 OseongKwon