darknet_ros
YOLO v4 + ROS2 Humble (Foxy) + CUDA11 + cuDNN (FP16)
Hello. I am a college student in Japan and a fan of darknet_ros.
I've been wanting to make the ROS2 + YOLO v4 implementation happen for a long time, and I'm happy to report that I was able to implement it.
Main changes (my commits -> foxy)
- Support for YOLO v4 : Switched the submodule to the master branch of AlexeyAB/darknet.
- Removed IPL : Switched from IplImage to cv::Mat for OpenCV4 support.
- cuDNN 🔥🔥 : Supports cuDNN & FP16
Requirements
- ROS2 Foxy
- OpenCV4 ($ sudo apt install ros-foxy-vision-opencv)
- CUDA 10 or 11 (tested with CUDA 11.3)
- cuDNN 8 (Optional)
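
Before building, it can help to confirm that the requirements above are actually visible from the shell. The following is a minimal sketch, not part of the repository: the helper name `report` is hypothetical, and the paths assume a default ROS2 Foxy install under `/opt/ros/foxy` and `nvcc` being on `PATH`. cuDNN is not checked because its install location varies.

```shell
# Hypothetical helper: run a check command and print "<label>: ok" or "<label>: missing".
report() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "${label}: ok"
  else
    echo "${label}: missing"
  fi
}

# Assumed default locations and tool names; adjust them for your machine.
report "ROS2 Foxy setup script" test -f /opt/ros/foxy/setup.bash
report "CUDA compiler (nvcc)" command -v nvcc
report "colcon" command -v colcon
```

Any line reported as `missing` only means the check did not find it in the assumed location, not that the component is necessarily absent.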
Installation
$ source /opt/ros/foxy/setup.bash
$ mkdir -p ~/ros2_ws/src
$ cd ~/ros2_ws/src
$ git clone --recursive https://github.com/Ar-Ray-code/darknet_ros_yolov4.git
$ darknet_ros_yolov4/darknet_ros/rm_darknet_CMakeLists.sh
$ cd ~/ros2_ws
$ colcon build --symlink-install
Demo
Connect your webcam to your PC.
Terminal
$ source /opt/ros/foxy/setup.bash
$ source ~/ros2_ws/install/local_setup.bash
$ ros2 launch darknet_ros demo-v4-tiny.launch.py

Performance
YOLO v4 consumes much more GPU memory than YOLO v3 and runs at a lower frame rate (in the test below, roughly twice the VRAM and less than half the frame rate), so you need to pay attention to your PC specs.
Test Machine
| Component | Spec |
|---|---|
| CPU | Ryzen7 2700X (@3.7GHz x 16) |
| RAM | 16GB DDR4 |
| GPU | NVIDIA GeForce RTX 2080 Ti (GDDR6 11GB) |
| Driver | 460.32.03 |
Performance
- YOLO v3 : 67 fps (62 ~ 72 fps), uses 1781MB of VRAM
- YOLO v4 : 29 fps (27 ~ 30.5 fps), uses 3963MB of VRAM
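
To put the two measurements side by side, the relative cost of YOLO v4 can be worked out from the figures above. This is only a small arithmetic sketch; the helper name `ratio` is hypothetical, and the numbers are the average frame rates and VRAM figures reported in this post.

```shell
# Figures taken from the measurements reported above.
v3_fps=67; v3_vram_mb=1781
v4_fps=29; v4_vram_mb=3963

# Hypothetical helper: ratio <numerator> <denominator> <printf-format>
ratio() {
  awk -v a="$1" -v b="$2" -v fmt="$3" 'BEGIN { printf fmt "\n", a / b }'
}

echo "YOLO v4 frame rate relative to v3: $(ratio "$v4_fps" "$v3_fps" "%.2f")x"
echo "YOLO v4 VRAM usage relative to v3: $(ratio "$v4_vram_mb" "$v3_vram_mb" "%.1f")x"
```

On this test machine that works out to about 0.43x the frame rate and about 2.2x the VRAM of YOLO v3.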
Please give it a try. Thank you.
Exciting work, thank you. We'll try to evaluate your work and come back to this asap.
By supporting cuDNN (FP16), I have made inference about 1.3 times faster. Please see the following report. Also, CPU-only inference is not supported at this stage.
This repository explains it.

- English -> https://github.com/Ar-Ray-code/darknet_ros_fp16/wiki/Darknet_ros_FP16-Report-(1.3x-faster)-%F0%9F%94%A5
- Japanese -> https://zenn.dev/array/articles/4c82fc8382e62d
Dear @Ar-Ray-code, first of all, sorry for the slow response. @mbjelonic and I like your contributions. Right now, we're not using Darknet for ROS 2 on our real robots (we're using Noetic, since real-robot development is a bit slower than pure software development). So for the time being I propose that the foxy branch be more of a 'community' branch, rather than a leggedrobotics-supported branch. On this branch, we can be more flexible and quicker in merging PRs.
Now I only have to see how I can resolve any conflicts between this PR and #337. If you have a suggestion, feel free to let me know.
@tomlankhorst and @Ar-Ray-code maybe we can merge this branch and give @Ar-Ray-code permissions to check PRs to the foxy branch?
Changed CMakeLists.txt to work correctly on CPU. OpenMP is used. https://github.com/leggedrobotics/darknet_ros/pull/319/commits/706dce051f4dacc345dd3ebf1166df34d35e05c6
Did you develop on top of the master or the foxy branch, @Ar-Ray-code? Could you rebase such that only your changes are included?
I developed this on top of the master branch.
Did the build for the GPU work? If you have any questions, please let me know :)
I will add support for ROS2 Humble and the Ampere architecture. Are there any plans to create a Humble branch?
Hello! Your code and advice have helped us a lot. Thank you so much. I ran into an error:
CMake Error at /usr/share/cmake-3.16/Modules/FindCUDA.cmake:707 (message): Specify CUDA_TOOLKIT_ROOT_DIR
Because of this, colcon build fails. How can I solve this problem? Please help!
--> Feb 9, 17:18 (Korea time): I guess it was a CUDA version problem. I now have another issue.
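
Regarding the `Specify CUDA_TOOLKIT_ROOT_DIR` error above: CMake's FindCUDA module raises it when it cannot locate the CUDA toolkit. One possible workaround, sketched below, is to pass the toolkit location to CMake through colcon's `--cmake-args`. The helper name `cuda_root_from_nvcc` is hypothetical, and deriving the root from the location of `nvcc` is an assumption that may not hold for every installation.

```shell
# Hypothetical helper: derive the toolkit root from an nvcc path,
# e.g. /usr/local/cuda/bin/nvcc -> /usr/local/cuda
cuda_root_from_nvcc() {
  dirname "$(dirname "$1")"
}

# Look for nvcc on PATH; an empty result means it was not found.
nvcc_path=$(command -v nvcc || true)

if [ -n "$nvcc_path" ]; then
  cuda_root=$(cuda_root_from_nvcc "$nvcc_path")
  # Print the suggested build command instead of running it, so it can be reviewed first.
  echo "colcon build --symlink-install --cmake-args -DCUDA_TOOLKIT_ROOT_DIR=${cuda_root}"
else
  echo "nvcc not found on PATH; install the CUDA toolkit or add its bin directory to PATH"
fi
```

If the printed path looks right for your machine, run the suggested command from `~/ros2_ws`. If `nvcc` is not found at all, the toolkit is probably not installed or not on `PATH`, which would also explain the CMake error.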