Faster than 1 second inference?
First, thanks for publishing this project.
I'm trying to infer objects faster than 1 FPS. My camera is 20 FPS.
I have the motion detector interval down to 0.05 (1/20) and the object detection interval down to 0.05. Looking at the debug logs, I'm guessing I get about 1-2, maybe 3 FPS sent into the detector. It's difficult to tell, because the Viseron logs collapse the same message ("objects detected []") into "repeated X times" and don't timestamp each detection log.
My Jetson Nano is idling around 20% CPU across the cores and barely hitting the GPU (watching via jtop), and I want to process more FPS for a real-time vehicle application.
I've read some docs and the previous issues. It seems like I need to decrease the motion detector interval, but when I decrease it below 0.05 I get divide-by-zero errors in the code.
How can I increase the FPS pushed into the Object Detection engine and not hit divide by 0 errors?
Here's my config
cameras:
  - name: Camera
    host: 192.168.123.132
    port: 554
    username: <if auth is enabled>
    password: <if auth is enabled>
    path: /live.sdp
    width: 1920
    height: 1920
    fps: 20
    motion_detection:
      interval: 0.05
      trigger_detector: false
      trigger_recorder: false
      timeout: true
      max_timeout: 30
      width: 416
      height: 416
      area: 0.1
      threshold: 1
      frames: 1

object_detection:
  type: darknet
  interval: 0.05
  log_all_objects: true

logging:
  level: debug
Thanks for showing interest in Viseron!
You should not have to decrease the interval any lower; I suspect the bottleneck is elsewhere. Hard to guess where, though.
Do you get faster detections if you swap the model for the yolov3-tiny version?
object_detection:
  type: darknet
  interval: 0.05
  model_path: /detectors/models/darknet/yolov3-tiny.weights
  log_all_objects: true
Edit: Also the motion detector interval can be set to a higher number without affecting the object detector
Googling a bit about running YOLOv4 on the Nano with OpenCV, it seems that the FPS is generally quite low. This post points towards around 2 FPS: https://forums.developer.nvidia.com/t/yolov4-with-opencv/158725
To utilize the Nano better it seems other tools and models need to be used. Is this something you have experience with?
Yeah, I do have some experience there, so I know the targets I want to hit from that experience. I made my own multithreaded Python engine that uses the NVIDIA-optimized GStreamer to feed from an RTSP h264 camera stream, runs Darknet YOLOv3-tiny inference on real-time frames, has a listener for MQTT control and notifications, and pushes captured images and objects out via MQTT. Many similarities with your project! I can get about 12 FPS detection out of YOLOv3-tiny on the Nano with a custom-trained 416x416 model.
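The pattern described here (a capture thread feeding an inference thread through a bounded queue, so detection always runs on a fresh frame instead of falling behind the stream) can be sketched in plain Python. The frame source and detector below are stand-ins for illustration only, not the actual GStreamer or Darknet code:

```python
import queue
import threading

# Stand-ins for the real GStreamer capture and Darknet YOLOv3-tiny inference.
def capture_frames(n):
    """Pretend RTSP source yielding frame ids."""
    yield from range(n)

def detect_objects(frame):
    """Pretend inference returning detections for a frame."""
    return [f"object-in-frame-{frame}"]

def run_pipeline(total_frames=100, queue_size=1):
    # A size-1 queue means stale frames get dropped, so inference
    # always works on (close to) the newest frame from the camera.
    frames = queue.Queue(maxsize=queue_size)
    results = []

    def producer():
        for frame in capture_frames(total_frames):
            try:
                frames.put_nowait(frame)
            except queue.Full:
                pass  # drop the frame instead of falling behind real time
        frames.put(None)  # sentinel: end of stream

    def consumer():
        while True:
            frame = frames.get()
            if frame is None:
                break
            results.extend(detect_objects(frame))

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

detections = run_pipeline()
print(len(detections))  # between 1 and 100, depending on how many frames were dropped
```

The drop-on-full queue is the key design choice for real-time use: latency stays bounded because the consumer never processes a backlog of old frames.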
If you look at the second response on the thread you posted, he also confirms getting 12 FPS with YOLOv4-tiny on the Nano.
You can see the real-time CPU and GPU utilization using jetson-stats: https://github.com/rbonghi/jetson_stats All CPU cores hover around 20%, and the GPU is barely ever touched.
I changed the model_path and model_config to the tiny versions, but see the same results. It's about 1 FPS, and the box's resources are hardly tapped at all. It just posts that objects [] were found, about one message repeated per second.
Here's the traceback after I change the motion and detector intervals to 0.025, trying to inspect 2 out of every 20 frames instead of just 1 (0.05):

viseron | Exception in thread viseron.camera.cisco:
viseron | Traceback (most recent call last):
viseron |   File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner
viseron |     self.run()
viseron |   File "/usr/local/lib/python3.8/threading.py", line 870, in run
viseron |     self._target(*self._args, **self._kwargs)
viseron |   File "/src/viseron/camera/__init__.py", line 114, in capture_pipe
viseron |     decoder.scan_frame(current_frame)
viseron |   File "/src/viseron/camera/frame_decoder.py", line 93, in scan_frame
viseron |     if self._frame_number % self._interval_fps == 0:
viseron | ZeroDivisionError: integer division or modulo by zero
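The crash is consistent with the interval being converted to a per-frame skip count by multiplying it with the stream FPS and rounding to an integer: at interval 0.025 and 20 FPS the product is 0.5, which rounds to 0, and the modulo in scan_frame then divides by zero. A minimal sketch of that arithmetic plus a guard, assuming a simple round() conversion (the names below are illustrative, not Viseron's actual API):

```python
# Hypothetical reconstruction of the interval-to-frames conversion that
# produces the ZeroDivisionError; names are illustrative, not Viseron's API.

def interval_to_frame_skip(interval: float, stream_fps: int) -> int:
    """Convert a scan interval in seconds to 'scan every Nth frame'."""
    return round(interval * stream_fps)  # 0.05 * 20 -> 1, but 0.025 * 20 -> 0!

def should_scan(frame_number: int, interval: float, stream_fps: int) -> bool:
    # Clamping to at least 1 avoids 'modulo by zero' when the interval
    # is shorter than a single frame period.
    skip = max(1, interval_to_frame_skip(interval, stream_fps))
    return frame_number % skip == 0

print(interval_to_frame_skip(0.05, 20))   # 1 -> every frame is a candidate
print(interval_to_frame_skip(0.025, 20))  # 0 -> modulo by zero without a guard
print(should_scan(7, 0.025, 20))          # True with the guard in place
```

With 20 FPS input, no interval below 0.05 can inspect more frames anyway, since 0.05 already selects every frame; the guard just makes sub-frame intervals behave like 0.05 instead of crashing.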
> I made my own multithreaded Python engine that uses the NVIDIA-optimized GStreamer ... I can get about 12 FPS detection out of Yolov3-tiny on the Nano with a custom trained 416x416 model.
That sounds awesome! Do you have your code posted anywhere? Would love to have a look.
Would be great to make a tailored solution for the Nano, but sadly I don't own a Nano myself, so creating something like that is very hard for me on my own (it took me ages to get it running on the Nano in the first place!). I have some work going on right now where I'm trying to make Viseron more modular, including the interfacing with the cameras. Right now FFmpeg is the only possibility, but I would like to be able to utilize, in this instance, GStreamer as you mentioned.
> Here's the traceback, after I change motion and detector intervals to 0.025 ...
> ZeroDivisionError: integer division or modulo by zero
interval: 0.05 should already be working at 20 FPS for you. interval is specified in seconds, so if you take 1/20 = 0.05, it will inspect every frame. However, it doesn't seem like the current implementation can keep up with that.
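To make the seconds-to-frames relationship concrete, here is a quick sketch of the effective scan rate at 20 FPS for different interval values, assuming the interval is converted to a frame skip with a simple round() (which matches the division in the traceback above):

```python
# Effective scan rate for a 20 FPS stream at different interval settings,
# assuming interval seconds are converted to a frame skip via round().

STREAM_FPS = 20

def scans_per_second(interval: float) -> float:
    frame_skip = round(interval * STREAM_FPS)
    if frame_skip == 0:
        # An interval shorter than one frame period (1/20 s) cannot
        # select more than every frame; it just breaks the modulo.
        raise ZeroDivisionError("interval shorter than one frame period")
    return STREAM_FPS / frame_skip

print(scans_per_second(0.05))  # 20.0 -> every frame is inspected
print(scans_per_second(0.10))  # 10.0 -> every 2nd frame
print(scans_per_second(1.0))   # 1.0  -> one frame per second
```

So 0.05 is already the floor for a 20 FPS stream; anything smaller rounds the frame skip down to zero, which is the divide-by-zero seen above.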
> That sounds awesome! Do you have your code posted anywhere? Would love to have a look.

@jasonbarbee Yes, please, let's have a look!
Update - I got permission to share the code; I need a little time to test and write a README. I will update here when it's ready.
Closing due to inactivity