usb_cam icon indicating copy to clipboard operation
usb_cam copied to clipboard

Framerate issue - 30fps not reached at full hd - ros humble

Open firesurfer opened this issue 1 year ago • 24 comments

Issue

On the image_raw and the compressed topic the desired framerate of 30 fps is not even closely reached. On average I reach with the default configuration around 16 fps.

Interestingly the update rate on the raw/compressed/info topics does not always match. But this could be a measuring issue.

The Webcam reports (also in the startup of the usb cam node) that it supports Full HD at 30Hz. Using VLC media player I can also play it at 30fps (if the media information displayed is correct)

System

Ubuntu 22.04 ROS2 Humble (binary install) usb_cam: 0.6.0

v4l2-ctl -D

Driver Info:
	Driver name      : uvcvideo
	Card type        : HD Pro Webcam C920
	Bus info         : usb-0000:0c:00.3-2
	Driver version   : 5.19.17
	Capabilities     : 0x84a00001
		Video Capture
		Metadata Capture
		Streaming
		Extended Pix Format
		Device Capabilities
	Device Caps      : 0x04200001
		Video Capture
		Streaming
		Extended Pix Format
Media Driver Info:
	Driver name      : uvcvideo
	Model            : HD Pro Webcam C920
	Serial           : 
	Bus info         : usb-0000:0c:00.3-2
	Media version    : 5.19.17
	Hardware revision: 0x0000000c (12)
	Driver version   : 5.19.17
Interface Info:
	ID               : 0x03000002
	Type             : V4L Video
Entity Info:
	ID               : 0x00000001 (1)
	Name             : HD Pro Webcam C920
	Function         : V4L2 I/O
	Flags            : default
	Pad 0x01000007   : 0: Sink
	  Link 0x0200001c: from remote pad 0x100000a of entity 'Processing 3' (Video Pixel Formatter): Data, Enabled, Immutable

Camera yaml

ros__parameters:
  video_device: "/dev/video0"
  framerate: 30.0
  io_method: "mmap"
  frame_id: "camera_1"
  pixel_format: "mjpeg2rgb"

color_format: "yuv422p" #Will result in less then 30fps on uncompressed topic

  image_width: 1920
  image_height: 1080
  camera_name: "logitech_c920"
  camera_info_url: "package://camera/config/camera_1_info.yaml"

Further info

I played around with the following parameters:

  1. Based on a comment in the main loop of the usb cam node I set a very small manual exposure time. This did not increase the framerate/update rate
  2. Selecting a userptr as io_method. Did increase the framerate by like 2 fps
  3. Reducing the image size from Full HD solved the issue. The frames are then published at 30fps
  4. color_format. See my comment in the camera yaml file. Setting an explicit color_format has a negative impact on the frame rate. The difference compressed/raw topic publish rate seemed to be higher.

EDIT:

Just saw #261 . Depending on how ros2 topic hz measures this could result in a wrong measurement I guess

firesurfer avatar Jul 17 '23 09:07 firesurfer

@firesurfer thanks for reporting this! So I've been able to track it down to the select() call. For some reason this is taking a very long time it looks like - from my tests it looks like ~50ms. I'll keep digging into it and provide a fix, apologies for the issue!

flynneva avatar Jul 21 '23 03:07 flynneva

@firesurfer did some more testing today after the timestamp update with #267 now merged to ros2 and it looks like only the image_raw topic reports a lower-than-normal hz with ros2 topic hz now. The image_raw/theora and image_compressed both output the expected / proper hz for me.

Also that select issue I was seeing is no longer there either, and now it seems to take the "expected" period of ~33ms for 30Hz and adjusts based on the selected framerate.

Can you confirm this is also what you are seeing?

flynneva avatar Jul 22 '23 20:07 flynneva

@flynneva #267 definitely seemed to bring an improvement. With a reduced image size also the image_raw topic reports 30Hz

firesurfer avatar Jul 24 '23 06:07 firesurfer

@firesurfer I'll go ahead and close this issue then 👍🏼

flynneva avatar Jul 24 '23 18:07 flynneva

@flynneva Sorry to bring up this issue again. I just tested again with the latest ros2-iron release and I run into the same issue. Camera works fine with vlc at 1080p@30Fps, with the usb_cam_node I can only reach around 15-20fps. Reducing to 720p solves the issue. Configuration is still the same as above.

firesurfer avatar Oct 04 '23 07:10 firesurfer

@firesurfer how are you measuring your FPS? Via ros2 topic hz? Which topic are you testing with? Compressed or uncompressed?

I am not able to reproduce your issue with the compressed topic at full resolution for my camera (1280 x 720) at 30 fps:

image

flynneva avatar Oct 06 '23 00:10 flynneva

I checked the FPS in two different ways:

  1. ros2 topic hz -> uncompressed tpic
  2. I have a node which writes video streams to files -> The resulting framerate in the file was also below 30fps

The issue only occurs at

image_width: 1920
image_height: 1080

If I reduce it to

image_width: 1280
image_height: 720

the framerate is correct (both types of measurement)

firesurfer avatar Oct 06 '23 09:10 firesurfer

@firesurfer and what's your cpu / memory usage when running at that full resolution? Are you topping out there at all?

flynneva avatar Oct 06 '23 14:10 flynneva

@flynneva I just ran some tests.

  1. 1280 x 720 : 25% cpu load per node, 30fps (ros2 topic hz) -> ros2 topic hz subscriptions increased cpu load by 10%
  2. 1920x 1080: 40% cpu load per node, 16fps In both cases i ran two camera nodes and subscribed to the image topics via rviz

As reference with vlc at 1920x1080 30-40% cpu at 30fps

In all test cases I made sure to not run any other cpu intensive workload during the test.

firesurfer avatar Oct 09 '23 10:10 firesurfer

@firesurfer awesome stuff! Ok so definitely doesn't look like you are topping out anything...give me a few days to debug a bit and circle back. Thanks again for reporting this and providing more info 🙏🏼

flynneva avatar Oct 10 '23 01:10 flynneva

@firesurfer generated a quick flamegraph to see where we can save the most time:

perf

Probably a little hard to see, but looks like most of the time is spent in libavcodec calls. I'll look into where we might be going wrong with the avcodec calls and see where we can improve 👍🏼

flynneva avatar Oct 17 '23 03:10 flynneva

@firesurfer looks like we might need to change a few things:

  1. In the avcodec space:
    1. Might need to adjust the avcodec_context settings to be multithreaded
    2. That link also says that maybe calling av_send_packet multiple times would help...but I'm not sure
  2. ROS 2 world: we could add a custom main and use the MultithreadedExecutor

flynneva avatar Oct 17 '23 03:10 flynneva

@flynneva I guess working on 1. makes the most sense. I also ran a callgrind analysis and also came to the conclusion that av_send_packet is where most of the time is spent.

If I am right, using a MultithreadExecutor shouldnt make too much differences regarding publishing of images as message are usually directly published in the publish call.

firesurfer avatar Oct 21 '23 16:10 firesurfer

@flynneva note to self: this PR / issue looks promising:

https://github.com/Kitware/kwiver/pull/1504

Do some testing this week to see if sending a nullptr av packet at the end of the logic helps with this issue 👍

fyi @firesurfer this is just a note to myself but you might also find it interesting

flynneva avatar Oct 30 '23 03:10 flynneva

Thanks for the info.

firesurfer avatar Oct 31 '23 09:10 firesurfer

@firesurfer could you test out #290 to see if it fixes your issue? I'm not really sure how to test to confirm this issue is resolved. If you have any ideas lmk, maybe we could add an integration test for it 👍🏼

flynneva avatar Nov 04 '23 17:11 flynneva

I will test this hopefully today.

As an idea for an integration test (I didnt take a look at the current tests):

You only need to generate a test stream of images at 30fps with 1920 x 1080 resolution and check if the resulting publishing rate is ~30Hz.

A simpler to implement intermediate check could be to introduce an update rate counter/timer for the publish calls and check if it averages at 30Hz

firesurfer avatar Nov 06 '23 09:11 firesurfer

@flynneva Just tested #290

Unfortunately it seems to make it worse. I only achieve 14fps on the raw image topic with this patch. On the compressed topic ros2 topic hz shows 25fps.

For completeness a recap what I did for testing:

  1. Adapt the camera config to 1920x1080
  2. Launch the camera node
  3. Subscribe to the raw image topic with rviz2
  4. run ros2 topic hz

firesurfer avatar Nov 06 '23 11:11 firesurfer

Hey @flynneva did you manage to reproduce this issue by now? (E.g. by providing the test image stream?)

firesurfer avatar Dec 19 '23 09:12 firesurfer

Hey @flynneva I have another update for you: I realized that even with 720p (1280 x 720) the frame rate drops the more cameras I add. E.g. at 7 cameras at the moment I only get ~15-20fps. The cameras are distributed to different USB host controllers. The system is far from being cpu capped btw.

So my guess would be that either something in the usb kernel subsystem limits the amount of data we can put through there or there is some kind of locking happening that makes it worse the more cameras we add.

firesurfer avatar Jun 12 '24 07:06 firesurfer

@firesurfer interesting. It's possible we are hitting the network buffer limits then since ROS2 still uses them even for local topics/messages.

Can you try increasing the buffers and retesting?

https://stackoverflow.com/questions/62227100/why-isnt-increasing-the-networking-buffer-sizes-reducing-packet-drops

flynneva avatar Jun 12 '24 12:06 flynneva

@firesurfer im realizing now that this is probably a QoS setting thing 😓 I should've noticed this before. Let me dig into it a bit today and see if we can quickly fix this with adjusting the QoS for the publishers

flynneva avatar Jun 16 '24 18:06 flynneva