darknet
Object detection is very slow
I have installed CUDA & OpenCV for object detection as described in the docs.
I tried to detect objects in a video file and got nearly FPS: 2.8.
./darknet detector demo cfg/coco.data cfg/yolo.cfg yolo.weights video-file.mp4
Then I tried to use cuDNN, but the result was FPS: 2.0.
What should I do to get higher FPS?
Environment: Ubuntu 16.04, CUDA 8.0.61, OpenCV 3.2, GPU: GeForce 840M (2 GB), Driver: NVIDIA 375.66
@AlexeyAB 1920x1080. Is that too big? What resolution does darknet expect?
Try to decrease width=416 and height=416 in the yolo.cfg file: https://github.com/pjreddie/darknet/blob/master/cfg/yolo.cfg#L8
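For reference, these fields live in the `[net]` section at the top of the cfg file (416 shown as discussed above; exact defaults vary between darknet versions):

```ini
[net]
# Network input resolution; must be a multiple of 32.
# Smaller values run faster but detect less accurately.
width=416
height=416
```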
A 1920x1080 video file is normal for Yolo, but also try using a lower-resolution video file.
I have decreased width and height in the config file. Now FPS is ~5.4, which is good, but it is not enough. Btw, I tried a lower-resolution video but the result was the same.
@AlexeyAB thank you.
Hi @Isabek, I've run into the same problem you described and wonder if you've solved it?
Hi @DennisWangCW, if you want to reach the very high FPS shown in the darknet documentation, you need that kind of computer.
Btw, you can train your own model, because YOLO tries to classify a lot of object classes.
P.S. I couldn't solve it. As @AlexeyAB said, I can reach only 7.5 FPS with my GPU, which is not enough for me.
@Isabek Hi,
You can use Tiny-Yolo instead of Yolo, so you can get about ~15 FPS on a GeForce 840M: ./darknet detector demo cfg/coco.data cfg/tiny-yolo.cfg tiny-yolo.weights video-file.mp4
- tiny-yolo.cfg: https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/tiny-yolo.cfg
- tiny-yolo.weights: https://pjreddie.com/media/files/tiny-yolo.weights
Also you can decrease width=288 and height=288 in tiny-yolo.cfg, so you can get about ~30 FPS on a GeForce 840M.
But each such step worsens the detection accuracy.
Thank you @AlexeyAB. I decreased width and height to 288 in tiny-yolo.cfg and changed the video file resolution to 960x540. Now FPS is ~25, which is cool. FYI @DennisWangCW
@AlexeyAB how can I train my own model? Is it possible? I have my own image dataset.
@Isabek Yes: https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
Hi @AlexeyAB. I am a bit confused about absolute_x and absolute_height. Could you explain them? Thanks!
@Isabek You have an image 1920x1080, and object-1 with center (100,200), width=50 and height=20.
So for this object: absolute_x = 100, absolute_y = 200, absolute_width = 50, absolute_height = 20.
The normalized values are computed as <x> = <absolute_x> / <image_width> and <height> = <absolute_height> / <image_height>, and likewise for y and width.
Into the txt file you should write: 1 0.052 0.185 0.026 0.0185
Or just use this tool: https://github.com/AlexeyAB/Yolo_mark
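As an illustrative sketch (plain Python, not part of darknet; the function name is made up), the conversion described above looks like this, using the 1920x1080 example:

```python
# Sketch: convert an absolute bounding box (center x/y, width, height in
# pixels) to YOLO's normalized annotation line, per the formulas above.

def to_yolo_line(class_id, abs_x, abs_y, abs_w, abs_h, img_w, img_h):
    """Return one annotation line; all coordinates relative to image size."""
    return "{} {:.3f} {:.3f} {:.3f} {:.4f}".format(
        class_id, abs_x / img_w, abs_y / img_h, abs_w / img_w, abs_h / img_h)

print(to_yolo_line(1, 100, 200, 50, 20, 1920, 1080))
# -> 1 0.052 0.185 0.026 0.0185
```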
@AlexeyAB I have 6 classes. How many images should I choose for each class? I selected 100 images per class, and the weights file after 500 iterations weighs 256M.
@Isabek I still haven't solved the problem, but thank you anyway.
@Isabek Usually 500 - 2000 images per class (object) is enough. And you should train for (2000 x number_of_classes) iterations.
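As a quick worked example of that rule of thumb (plain Python; the class count comes from this thread, the factors are rough guidelines, not hard requirements):

```python
# Rule of thumb from above: 500-2000 images per class, ~2000 iterations per class.
num_classes = 6
min_images = 500 * num_classes
max_images = 2000 * num_classes
iterations = 2000 * num_classes
print(min_images, max_images, iterations)  # -> 3000 12000 12000
```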
@AlexeyAB thank you. I started training 2 days ago. It is still training my model :)
I played around with the yolo.cfg and tiny-yolo.cfg (width=416 and height=416) config files, but performance lags around 4 fps when reading from a video file and writing output to another video file (avi). What performance should developers expect on the Nvidia TX1 platform?
Also, when I run the web cam demo, I get about 10-12 FPS with tiny-yolo and a Sony Playstation Eye. Do I need to set the camera resolution somewhere?
nvidia@tegra-ubuntu:~/darknet$ ./darknet detector demo cfg/voc.data cfg/tiny-yolo-voc.cfg weights/tiny-yolo-voc.weights
FPS:11.2
Thx.
I'm experiencing about the same performance on the TX2; with tiny yolo there is a lot of headroom on the TX2 to spawn more processes. I'm able to get ~30 fps with 4 simultaneous processes.
@TheMikeyR Did you say 30 fps on the TX2? Is that the capture rate or the processing rate? What does your config/setup (hw/sw) look like for tiny yolo? Are you using OpenCV 3? My understanding is that OpenCV is not very good at video I/O (ffmpeg/gstreamer). What type of camera are you using?
@kaisark I'm processing offline video.
I run sudo jetson_clocks.sh from the home directory (it should be installed with JetPack) and then nvpmodel -m 2, which turns all cores on (from 4 to 6 cores) with MAXP.
I didn't modify the original tiny-yolo much; I have one class to predict, so I've of course modified the filters and classes.
I've compiled with CUDNN=1, GPU=1 and OPENCV=1, but then I've removed the "viewing" part of the demo function, so it doesn't display the results but only prints them in the terminal.
I've uncommented this line https://github.com/pjreddie/darknet/blob/c7252703420159a9f3a1ec416b1b4326c4c95402/src/demo.c#L194 to prevent OpenCV from displaying the video, which speeds it up (it still shows the detected objects in the terminal).
Lastly I'm opening 4 terminals and running the same command in all of them ./darknet detector demo data/rgb.data cfg/tiny-yolo.cfg ~/data/create/detection_annotation/yolo/tiny/tiny-yolo_50000.weights ~/data/create/videos/summarized_right_10min.mp4 -i 0
I believe the FPS figure covers the entire processing pipeline, since it is computed at a single point in demo.c and is not updated until execution reaches that point again.
The camera used was the Zed camera, taking only the right view of the RGB stream.
@Isabek Hi, did you train your own model with a smaller number of object classes? Does it improve FPS?
@xhuvom I have trained my own model on my collected dataset, but the result is the same: FPS is 3.7. You can watch the result here https://www.youtube.com/watch?v=QopUtQobWJ0
@AlexeyAB I would like to buy a new video card, and I am a little bit confused. What is the difference between MSI, Asus, EVGA and Zotac?
I am planning to buy a GTX 1070. How many frames per second can I reach with a GTX 1070 on YOLO?
@Isabek Primarily the cooler, and some of the cards are factory overclocked (run faster). Here is a list http://thepcenthusiast.com/geforce-gtx-1070-compared-asus-evga-zotac-msi-gigabyte/ where you can filter and compare by clock speed etc. In the end it doesn't matter much; you can also overclock the card yourself and just go with the cheaper one. It's a silicon lottery: sometimes you get a chip which can overclock a lot, and other times you can't achieve anything over stock speeds.
Can't help with FPS, depends on many things.
@Isabek You can achieve about ~0.01 FPS per 1 GFlops-SP using the yolo-voc.cfg network at 416x416 on my fork.
Look at Single precision for your GPU: https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_10_series - so on a GeForce GTX 1070: 6462 GFlops-SP = ~64 FPS.
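That estimate can be checked with a one-liner (the 0.01 FPS/GFlops factor and the 6462 GFlops-SP figure come from the comment above; both are rough approximations):

```python
# Rough throughput estimate for yolo-voc.cfg at 416x416.
gflops_sp = 6462        # GTX 1070 single-precision throughput
fps_per_gflop = 0.01    # empirical factor quoted above
print(int(gflops_sp * fps_per_gflop))  # -> 64
```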
As TheMikeyR correctly said, GPUs from different vendors are distinguished by slight factory overclocking and the cooling system, and they can also differ in reliability.
I am a bit confused about precision. How can I calculate the model's precision? I want to compute it inside the function which calculates recall and IoU. I found an answer to my question, but I am not sure about it.
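For reference, the standard definitions are (a minimal plain-Python sketch, not darknet-specific; the counts are made up):

```python
# Precision = TP / (TP + FP): of all detections made, how many were correct.
# Recall    = TP / (TP + FN): of all ground-truth objects, how many were found.

def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0

print(precision(80, 20))  # -> 0.8
print(recall(80, 40))     # -> 0.666...
```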
@AlexeyAB I need a recommendation for the kind/brand/model of camera to use for training, detection and recognition of multiple faces at a time from a live stream (e.g. a check-in counter). Please help.
@TheMikeyR Hey, I am also working on the Jetson TX2 and I get the following FPS with sudo nvpmodel -m 0 and sudo ./jetson_clocks.sh:
Tiny-Yolo: 17.5 fps, YoloV2: 2.7 fps, Google's Object Detection API with SSD_MobileNet: 4 fps
How much do you get, and how could I speed it up?
I documented my problem a little bit more in detail if you have a look here: https://devtalk.nvidia.com/default/topic/1027819/jetson-tx2/object-detection-performance-jetson-tx2-slower-than-expected/
It would be nice to hear from you!
@GustavZ I see the same performance on the TX2 with sudo nvpmodel -m 0 and sudo ./jetson_clocks.sh: Tiny-Yolo: 17.5 fps, YoloV2: 2.7 fps.
So I'll move to TensorRT with JetPack 3.2 beta.