rmw_cyclonedds

Poor Image Transport Performance

Open allenh1 opened this issue 3 years ago • 6 comments

Bug report

Required Info:

  • Operating System:
    • Ubuntu 18.04
  • Installation type:
    • Source
  • Version or commit hash:
    • Latest Foxy
  • DDS implementation:
    • CycloneDDS
  • Client library (if applicable):
    • N/A

Steps to reproduce issue

Send a 1080p raw image over WiFi from a Xavier AGX computer to an amd64 laptop.

Expected behavior

As with ROS 1, the image may arrive late, but will consistently arrive with little overhead.

Actual behavior

An image never makes it through to my laptop, and the network traffic increases to a point where my ssh session becomes unusable.

Additional information

It should be noted that the image does arrive if I make it quarter resolution.

The Ubuntu 18.04 xavier does run foxy, which is why we build from source.

allenh1 avatar Jul 29 '21 15:07 allenh1

Did you ever solve this? I have the same problem with pointclouds.

On my local laptop I can echo the msg. But on a remote computer I can get the topic list and also echo successfully /tf and other msgs. But I never get the pointcloud.

brennand avatar Aug 09 '21 12:08 brennand

@brennand there are a few things you can check that may make the difference, while we try to figure out what exactly causes the problems @allenh1 describes.

The first things to do are:

  • Check the QoS: if it is best-effort, any lost packet will automatically mean the entire point cloud is lost. Packet loss because of noise is pretty rare for wired networks/unicast on WiFi, but buffer overflows are always a possibility.
  • Check the size of your socket receive buffers: the data burst from a point cloud can overrun them with the default settings. Making the buffer large enough to hold a full point cloud makes that really unlikely.
    • the Linux default maximum is a mere 400kB or so (https://stackoverflow.com/questions/16460261/linux-udp-max-size-of-receive-buffer)
    • Cyclone by default asks for 1MB and accepts what it gets, you can require more (https://github.com/eclipse-cyclonedds/cyclonedds/blob/master/docs/manual/options.md#cycloneddsdomaininternalminimumsocketreceivebuffersize)
  • If there is WiFi involved, but the nodes that are trying to transfer pointclouds over the network are using a wired connection to an access point, Cyclone won't detect that it is dealing with WiFi and might start using multicast instead. Multicast and WiFi are a terrible combination. In such a case https://github.com/eclipse-cyclonedds/cyclonedds/blob/master/docs/manual/options.md#cycloneddsdomaingeneralallowmulticast is useful (especially the "spdp" setting)
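
To put rough numbers on the buffer-size point, here is a minimal sketch that computes the payload of one uncompressed frame and generates a Cyclone DDS configuration with a matching receive-buffer request. The element names follow the option paths linked above (Domain/Internal/MinimumSocketReceiveBufferSize and Domain/General/AllowMulticast). The helper names, the 3 bytes per pixel (e.g. rgb8), the reading of "quarter resolution" as half the width and half the height, and the two-frame headroom are all assumptions for illustration, not values recommended in this thread.

```python
import math
import xml.etree.ElementTree as ET


def frame_bytes(width, height, bytes_per_pixel=3):
    """Payload of one uncompressed frame; 3 bytes/pixel (e.g. rgb8) is an assumption."""
    return width * height * bytes_per_pixel


def cyclonedds_config(min_rcvbuf_bytes, allow_multicast="spdp"):
    """Build a Cyclone DDS XML config (hypothetical helper, not part of any library).

    The requested buffer size is rounded up to whole MiB.
    """
    mib = math.ceil(min_rcvbuf_bytes / 2**20)
    root = ET.Element("CycloneDDS")
    domain = ET.SubElement(root, "Domain")
    general = ET.SubElement(domain, "General")
    ET.SubElement(general, "AllowMulticast").text = allow_multicast
    internal = ET.SubElement(domain, "Internal")
    ET.SubElement(internal, "MinimumSocketReceiveBufferSize").text = f"{mib}MiB"
    return ET.tostring(root, encoding="unicode")


full = frame_bytes(1920, 1080)    # 1080p frame
quarter = frame_bytes(960, 540)   # half width, half height
print(f"1080p frame:   {full} bytes")
print(f"quarter frame: {quarter} bytes")

# Request room for two full frames (arbitrary headroom, an assumption).
print(cyclonedds_config(2 * full))
```

Under these assumptions a single 1080p frame is about 6.2 MB, several times the 1MB that Cyclone asks for by default. To try the generated configuration, save the XML to a file and point the CYCLONEDDS_URI environment variable at it. On Linux the kernel limit (net.core.rmem_max) also has to be raised to at least the requested size, otherwise the socket cannot get a buffer that large.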

I don't know if this will make any difference, but it is worth looking into. Perhaps https://github.com/ros2/rmw_cyclonedds/issues/251 gives some useful information, too.

eboasson avatar Aug 10 '21 13:08 eboasson

@allenh1 and @brennand , can I check if you are building foxy in Release mode on the AGX? Following the default instructions will not build in Release and that will have a significant impact on image transport performance. This in combination with @eboasson 's recommendations should work well.

Yadunund avatar Oct 13 '21 01:10 Yadunund

This issue has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros2-speed/20162/25

ros-discourse avatar Oct 15 '21 18:10 ros-discourse

@allenh1 and @brennand , can I check if you are building foxy in Release mode on the AGX?

This is built using -DCMAKE_BUILD_TYPE=Release.

Is there a way to adjust QoS with image_transport? I haven't seen a way in Foxy. Would be great if there was!

allenh1 avatar Nov 02 '21 21:11 allenh1

Hi everyone. Any update on this issue? We're having the same problem when publishing large messages (images and point clouds larger than 5MB) to a subscription on a remote node. The subscription rate dropped from 30Hz to 10Hz.

nguyentuanHUST avatar Aug 07 '23 08:08 nguyentuanHUST