rmw_cyclonedds
Poor Image Transport Performance
Bug report
Required Info:
- Operating System:
  - Ubuntu 18.04
- Installation type:
  - Source
- Version or commit hash:
  - Latest Foxy
- DDS implementation:
  - CycloneDDS
- Client library (if applicable):
  - N/A
Steps to reproduce issue
Send a 1080p raw image over WiFi from a Xavier AGX computer to an amd64 laptop.
Expected behavior
As with ROS 1, the image may arrive late, but will consistently arrive with little overhead.
Actual behavior
An image never makes it through to my laptop, and the network traffic increases to a point where my ssh session becomes unusable.
Additional information
It should be noted that the image does arrive if I make it quarter resolution.
The Ubuntu 18.04 xavier does run foxy, which is why we build from source.
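For a sense of scale, here is a rough sketch of the payload sizes involved. It assumes 3 bytes per pixel (for example an rgb8 or bgr8 encoding) and that "quarter resolution" means half the width and half the height; neither is stated in the report, and `raw_image_bytes` is a hypothetical helper, not part of any ROS API.

```python
def raw_image_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Size of the pixel data of one uncompressed frame, excluding message headers."""
    return width * height * bytes_per_pixel


# Assumption: 1080p means 1920x1080 at 3 bytes per pixel.
full = raw_image_bytes(1920, 1080)
# Assumption: quarter resolution means a quarter of the pixels (960x540).
quarter = raw_image_bytes(960, 540)

print(f"full frame:    {full} bytes")     # 6220800 bytes, about 6.2 MB
print(f"quarter frame: {quarter} bytes")  # 1555200 bytes, about 1.6 MB
```

Under these assumptions a full frame is four times the size of the quarter-resolution frame that does get through.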
Did you ever solve this? I have the same problem with pointclouds.
On my local laptop I can echo the message. On a remote computer I can get the topic list and successfully echo /tf and other messages, but I never receive the pointcloud.
@brennand there are a few things you can check that may make the difference, while we try to figure out what exactly causes the problems @allenh1 describes.
The first things to do are:
- Check the QoS: if it is best-effort, any lost packet will automatically mean the entire point cloud is lost. Packet loss because of noise is pretty rare for wired networks/unicast on WiFi, but buffer overflows are always a possibility.
- Check the size of your socket receive buffers: the data burst from a pointcloud can overrun them with the default settings. Making the buffer large enough to hold a full point cloud makes that really unlikely.
  - The Linux default maximum is a mere 400kB or so (https://stackoverflow.com/questions/16460261/linux-udp-max-size-of-receive-buffer)
  - Cyclone by default asks for 1MB and accepts what it gets; you can require more (https://github.com/eclipse-cyclonedds/cyclonedds/blob/master/docs/manual/options.md#cycloneddsdomaininternalminimumsocketreceivebuffersize)
- If there is WiFi involved, but the nodes that are trying to transfer pointclouds over the network are using a wired connection to an access point, Cyclone won't detect that it is dealing with WiFi and might start using multicast instead. Multicast and WiFi are a terrible combination. In such a case https://github.com/eclipse-cyclonedds/cyclonedds/blob/master/docs/manual/options.md#cycloneddsdomaingeneralallowmulticast is useful (especially the "spdp" setting)
I don't know if this will make any difference, but it is worth looking into. Perhaps https://github.com/ros2/rmw_cyclonedds/issues/251 gives some useful information, too.
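To see what receive buffer the operating system actually grants, a minimal sketch using only the Python standard library is shown below. It opens a plain UDP socket, requests a size, and reads back what the kernel reports. `granted_rcvbuf` is a hypothetical helper and this is not how Cyclone configures its own sockets; it only probes the OS limit. The 6220800-byte request assumes one raw 1920x1080 frame at 3 bytes per pixel.

```python
import socket


def granted_rcvbuf(requested_bytes: int) -> int:
    """Request a UDP receive buffer of the given size and return what the kernel reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


# 1 MiB, and the assumed size of one raw 1080p frame (1920 * 1080 * 3 bytes).
for requested in (1024 * 1024, 6220800):
    granted = granted_rcvbuf(requested)
    print(f"requested {requested} bytes, kernel reports {granted} bytes")
```

On Linux the reported value is twice the size that was actually applied (the kernel doubles it for bookkeeping), and the request is silently capped by `net.core.rmem_max`. If the reported value is far below the request, raising that sysctl is the usual first step before asking Cyclone for a larger buffer.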
@allenh1 and @brennand, can I check if you are building foxy in Release mode on the AGX? Following the default instructions will not build in Release, and that will have a significant impact on image transport performance.
This, in combination with @eboasson's recommendations, should work well.
This issue has been mentioned on ROS Discourse. There might be relevant details there:
https://discourse.ros.org/t/ros2-speed/20162/25
@allenh1 and @brennand , can I check if you are building foxy in Release mode on the AGX?
This is built using -DCMAKE_BUILD_TYPE=Release.
Is there a way to adjust QoS with image_transport? I haven't seen a way in Foxy. Would be great if there was!
Hi everyone. Any update on this issue? We're having the same problem when publishing large image and pointcloud data (more than 5MB per message) to a subscription on a remote node. The subscription rate dropped from 30 Hz to 10 Hz.