autoware-documentation

Update DDS isolation instructions for shared networks in Autoware

Open xmfcx opened this issue 1 year ago • 4 comments

Checklist

  • [X] I've read the contribution guidelines.
  • [X] I've searched other issues and no duplicate issues were found.
  • [X] I've agreed with the maintainers that I can plan this task.

Description

The current recommendation for Autoware users on crowded networks is to set export ROS_LOCALHOST_ONLY=1.

However, this approach does not effectively isolate DDS communication on a shared network, especially with ROS 2 Humble.

Related issue:

  • https://github.com/ros2/rmw_cyclonedds/issues/370

What is wrong with the current recommendation?

Currently, setting ROS_LOCALHOST_ONLY=1 does not prevent multiple computers from communicating with each other over DDS on ROS 2 Humble.

Here are the steps to reproduce the issue:

  • Have two computers on the same network.
  • On both computers, run:
    • export ROS_LOCALHOST_ONLY=1
    • sudo ip link set lo multicast on
    • export ROS_DOMAIN_ID=0
  • On PC 1, run ros2 run demo_nodes_cpp talker
  • On PC 2, run ros2 run demo_nodes_cpp listener
  • You should see the listener on PC 2 receiving messages from the talker on PC 1.

This should not be possible when ROS_LOCALHOST_ONLY=1 is set. Such leakage can lead to network congestion and other issues.
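To check whether DDS discovery traffic actually crosses the network during this reproduction, one option is to watch the physical interface for RTPS packets. The sketch below computes the default discovery multicast port for a given ROS_DOMAIN_ID, assuming the standard RTPS port mapping (7400 + 250 × domain ID) has not been overridden. The function name is hypothetical and not part of any ROS 2 tooling.

```shell
# Hypothetical helper: default RTPS discovery multicast port for a domain ID.
# Assumes the standard RTPS port mapping: PB (7400) + DG (250) * domain_id + d0 (0).
rtps_discovery_multicast_port() {
  domain_id="${1:-0}"
  echo $((7400 + 250 * $domain_id))
}
```

For example, sudo tcpdump -i eth0 udp port "$(rtps_discovery_multicast_port 0)" could be run on PC 2 (eth0 is an assumed interface name). If packets from PC 1's address show up, discovery traffic is leaving the loopback interface.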

Purpose

The purpose of this issue is to address and rectify the inefficacy of the ROS_LOCALHOST_ONLY=1 setting in isolating DDS traffic on crowded networks for Autoware users.

This is to prevent network congestion and to ensure that the DDS communications are confined to the intended network interfaces.

The resolution of this issue is crucial for maintaining network efficiency and performance, particularly in large office environments where multiple instances of Autoware might be operating on the same network.

Possible approaches

Update the documentation to include the following instructions:

  • set up cyclonedds.xml with:
    <Interfaces>
        <NetworkInterface name="lo" priority="default" multicast="default" />
    </Interfaces>
    
  • remove the export ROS_LOCALHOST_ONLY=1 line from .bashrc
    • keeping it can cause the error "lo: the same interface may not be selected twice".
  • stop the existing ROS 2 daemon: ros2 daemon stop
  • enable multicast for loopback interface: sudo ip link set lo multicast on
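The steps above could be scripted roughly as follows. This is a minimal sketch, not the official Autoware procedure: both function names are hypothetical, and the generated file contains only the Interfaces section from this issue, without any other tuning a real cyclonedds.xml may carry.

```shell
# Hypothetical helper: write a loopback-only CycloneDDS config to the given path.
# The file contains only the <Interfaces> entry proposed in this issue.
write_loopback_cyclonedds_config() {
  config_path="$1"
  cat > "$config_path" <<'EOF'
<?xml version="1.0" encoding="UTF-8" ?>
<CycloneDDS xmlns="https://cdds.io/config">
    <Domain Id="any">
        <General>
            <Interfaces>
                <NetworkInterface name="lo" priority="default" multicast="default" />
            </Interfaces>
        </General>
    </Domain>
</CycloneDDS>
EOF
}

# Hypothetical helper: fail with a warning if ROS_LOCALHOST_ONLY=1 is still set,
# since this issue reports that it conflicts with an explicit "lo" entry.
warn_if_localhost_only_set() {
  if [ "${ROS_LOCALHOST_ONLY:-0}" = "1" ]; then
    echo "ROS_LOCALHOST_ONLY=1 is still set; remove it from .bashrc" >&2
    return 1
  fi
  return 0
}
```

After writing the file, CycloneDDS can be pointed at it with export CYCLONEDDS_URI=file:///absolute/path/to/cyclonedds.xml (the path is a placeholder), followed by ros2 daemon stop and sudo ip link set lo multicast on as listed above.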

Definition of done

The docs are updated.

xmfcx avatar Nov 17 '23 16:11 xmfcx

From the discussion in the Software WG Meeting 2023/11/21 we should also modify:

  • https://tier4.github.io/AWSIM/DeveloperGuide/TroubleShooting/

We should also raise this issue for the ROS 2 Humble documentation.

xmfcx avatar Nov 28 '23 14:11 xmfcx

@xmfcx I encountered this problem, found the cause, and solved it successfully. The relevant documentation is https://autowarefoundation.github.io/autoware-documentation/main/installation/additional-settings-for-developers/#tuning-dds

The heart of the problem lies in autodetermine="true".

Solution:

step-1 configure

In .bashrc, comment out the following lines:

# export ROS_DOMAIN_ID=27
# export ROS_LOCALHOST_ONLY=1

Modify the cyclonedds.xml file so that it uses <NetworkInterface name="lo" priority="default" multicast="default" /> instead of the autodetermine entry:

<?xml version="1.0" encoding="UTF-8" ?>
<CycloneDDS xmlns="https://cdds.io/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="https://cdds.io/config https://raw.githubusercontent.com/eclipse-cyclonedds/cyclonedds/master/etc/cyclonedds.xsd">
    <Domain Id="any">
        <General>
            <Interfaces>
                <!-- <NetworkInterface autodetermine="true" priority="default" multicast="default" /> -->
                <NetworkInterface name="lo" priority="default" multicast="default" />
            </Interfaces>
            <AllowMulticast>default</AllowMulticast>
            <MaxMessageSize>65500B</MaxMessageSize>
        </General>
        <Internal>
            <SocketReceiveBufferSize min="10MB"/>
            <Watermarks>
                <WhcHigh>500kB</WhcHigh>
            </Watermarks>
        </Internal>
    </Domain>
</CycloneDDS>

step-2 restart the computer
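After the restart it may be worth confirming that multicast is still enabled on the loopback interface, since a setting applied with ip link set may not persist across reboots. The sketch below is one way to check; the function name is hypothetical, and it only inspects the flag list that ip link show prints inside angle brackets.

```shell
# Hypothetical helper: succeed if `ip link show` output on stdin lists
# MULTICAST among the interface flags (the comma-separated list inside <...>).
has_multicast_flag() {
  grep -Eq '<([^<>]*,)?MULTICAST(,[^<>]*)?>'
}
```

Usage: ip link show lo | has_multicast_flag && echo "multicast is enabled on lo". If the check fails, sudo ip link set lo multicast on needs to be run again.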

zymouse avatar Jan 10 '24 09:01 zymouse

@zymouse It seems you have arrived at the same solution that I proposed in the "Possible approaches" section of my post. Good to see your confirmation. This issue exists to track updating the related documentation.

xmfcx avatar Jan 10 '24 10:01 xmfcx

This pull request has been automatically marked as stale because it has not had recent activity.

stale[bot] avatar Mar 10 '24 11:03 stale[bot]