rmw_fastrtps icon indicating copy to clipboard operation
rmw_fastrtps copied to clipboard

Messages dropped while multiple publishers are publishing,help with XML profile.

Open yashmewada9618 opened this issue 11 months ago • 4 comments

  • Operating System:
    • Ubuntu 22.04
  • Installation type:
    • source
  • Version or commit hash:
    • Humble release
  • DDS implementation:
    • Fast-RTPS

Description

How to write Fast RTPS XML profile to publish in unicast mode with modified socket buffer sizes similar to the provided cycloneDDS profile.

Cyclone DDS

<?xml version="1.0" encoding="UTF-8"?>
<CycloneDDS xmlns="https://cdds.io/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="https://cdds.io/config https://raw.githubusercontent.com/eclipse-cyclonedds/cyclonedds/master/etc/cyclonedds.xsd">
    <Domain id="any">
        <Internal>
            <SocketReceiveBufferSize min="250MB" />
        </Internal>
        <General>
            <Interfaces>
                <NetworkInterface name="lo" />
            </Interfaces>
            <AllowMulticast>false</AllowMulticast>
        </General>
        <Discovery>
            <ParticipantIndex>auto</ParticipantIndex>
            <Peers>
                <Peer Address="localhost" />
            </Peers>
            <MaxAutoParticipantIndex>120</MaxAutoParticipantIndex>
        </Discovery>
    </Domain>
</CycloneDDS>

My not working Fast RTPS profile

<?xml version="1.0" encoding="UTF-8"?>
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <profiles>
        <!-- Default publisher profile -->
        <publisher profile_name="default publisher profile" is_default_profile="true">
            <topic>
                <resourceLimitsQos>
                    <max_samples>0</max_samples>
                    <max_instances>0</max_instances>
                    <max_samples_per_instance>0</max_samples_per_instance>
                    <allocated_samples>0</allocated_samples>
                </resourceLimitsQos>
            </topic>
            <qos>
                <reliability>
                    <max_blocking_time>
                        <sec>DURATION_INFINITE_SEC</sec>
                    </max_blocking_time>
                </reliability>
                <publishMode>
                    <kind>SYNCHRONOUS</kind>
                </publishMode>
                <data_sharing>
                    <kind>OFF</kind>
                </data_sharing>
            </qos>
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </publisher>

        <!-- Default subscriber profile -->
        <subscriber profile_name="default subscriber profile" is_default_profile="true">
            <qos>
                <data_sharing>
                    <kind>OFF</kind>
                </data_sharing>
            </qos>
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </subscriber>

        <transport_descriptors>
            <transport_descriptor>
                <transport_id>UdpTransport</transport_id>
                <type>UDPv4</type>
            </transport_descriptor>
        </transport_descriptors>

        <participant profile_name="udp_transport_profile" is_default_profile="true">
            <rtps>
                <sendSocketBufferSize>1048576</sendSocketBufferSize>
                <listenSocketBufferSize>4194304</listenSocketBufferSize>
                <builtinTransports>LARGE_DATA</builtinTransports>
                <userTransports>
                    <transport_id>UdpTransport</transport_id>
                </userTransports>
                <useBuiltinTransports>true</useBuiltinTransports>
            </rtps>
        </participant>
    </profiles>
</dds>

Details

I have an Ouster OS1-32U 3D lidar and an Intel Realsense and they both combined make up to 35 publishers including the depth clouds from both the sensors and their respective IMU data. The incoming data is as I require viz IMU at 200Hz, OS cloud at 20, and Realsense Cloud at 30Hz. But when I start recording the bag file and play the recorded bag file my messages are not as per the frame rates I mentioned above and they are significantly less (almost 0 for a few seconds).

Previously I was using Cyclone DDS which I think is bottlenecking the data transport hence I want to shift to FastRTPS, but it seems I am unable to reproduce the same profile manner here i.e (Publish in unicast mode and only to the localhost and not any other connected PC, increase the incoming buffer size).

Steps to reproduce issue

  • Launch the Ouster lidar and realsense camera nodes, then launch the rosbag record node. Launch file
ouster_launch = IncludeLaunchDescription(
        XMLLaunchDescriptionSource(
            [
                PathJoinSubstitution(
                    [FindPackageShare("ouster_ros"), "launch", "sensor.launch.xml"]
                )
            ]
        ),
        launch_arguments={
            "viz": "false",
            "sensor_hostname": "os-122212000733.local",  # Replace with the actual sensor hostname
            "lidar_mode": "1024x20",
            "udp_profile_lidar": "RNG19_RFL8_SIG16_NIR16",
            "metadata": "/home/scout/autonomy-stack/src/os-122212000733-metadata.json",  # Replace with relative path to metadata file
            "timestamp_mode": "TIME_FROM_ROS_TIME",
        }.items(),
    )
realsense_launch = IncludeLaunchDescription(
        PythonLaunchDescriptionSource(
            [
                PathJoinSubstitution(
                    [FindPackageShare("realsense2_camera"), "launch", "rs_launch.py"]
                )
            ]
        ),
        launch_arguments={
            "initial_reset": "true",
            "pointcloud.enable": "true",
            "enable_infra1": "false",
            "enable_infra2": "false",
            "enable_sync": "true",
            "align_depth.enable": "true",
            "enable_gyro": "true",
            "enable_accel": "true",
            "gyro_fps": "200",
            "accel_fps": "200",
            "unite_imu_method": "2",
        }.items(),
    )
    
    # No compression of bag file, just splitting.
rosbag_record = ExecuteProcess(
        cmd=[
            "ros2",
            "bag",
            "record",
            "-o",
            directory,
            "--qos-profile-overrides-path",
            Qos_override,
            "--max-cache-size",
            "1000000000",  # 1 GB
            "--max-bag-size",
            "3221225472",  # 3Gb of raw data
            "--max-bag-duration",
            "1800",  # seconds
            "-s",
            "sqlite3",
            "-a",
        ],
        shell=True,
    )

Expected behavior

  • No messages should be lost in the rosbag record.

Actual behavior

[ros2-10] 	/camera/camera/color/metadata: 329
[ros2-10] 	/ouster/range_image: 219
[ros2-10] 	/camera/camera/gyro/metadata: 2184
[ros2-10] 	/ouster/nearir_image: 219
[ros2-10] 	/ouster/reflec_image: 219
[ros2-10] 	/ouster/signal_image: 219
[ros2-10] 	/camera/camera/gyro/sample: 2184
[ros2-10] 	/camera/camera/accel/metadata: 1110
[ros2-10] 	/camera/camera/imu: 2184
[ros2-10] 	/ouster/scan: 219
[ros2-10] 	/camera/camera/color/image_raw: 329
[ros2-10] 	/ouster/points: 727
[ros2-10] 	/camera/camera/aligned_depth_to_color/image_raw: 327
[ros2-10] 	/camera/camera/aligned_depth_to_color/camera_info: 327
[ros2-10] 	/camera/camera/depth/image_rect_raw: 328
[ros2-10] 	/camera/camera/depth/color/points: 327
[ros2-10] 	/ouster/imu: 1935
[ros2-10] 	/camera/camera/depth/camera_info: 328
[ros2-10] 	/camera/camera/color/camera_info: 329
[ros2-10] 	/camera/camera/accel/sample: 1110
[ros2-10] 	/camera/camera/depth/metadata: 328
[ros2-10] Total lost: 15481

Thanks.

yashmewada9618 avatar Mar 05 '24 18:03 yashmewada9618

@yashmewada9618 out of curiosity.

i.e (Publish in unicast mode and only to the localhost and not any other connected PC, increase the incoming buffer size).

for localhost, ROS 2 provides ROS_LOCALHOST_ONLY environment variable to isolate the communication in the localhost. this is much easier and agnostic from rmw implementation.

besides that, why did you disable data sharing and shared memory transport? i think those are meant to be used for localhost low latency high throughput communication.

fujitatomoya avatar Mar 05 '24 23:03 fujitatomoya

CC: @MiguelCompany @EduPonz

fujitatomoya avatar Mar 05 '24 23:03 fujitatomoya

@fujitatomoya I tried using the ROS_LOCALHOST_ONLY but it didn't work in my case, a more detailed problem can be found here. Regarding data sharing and shared memory transport, I was looking into the config on how fastrtps performs with my sensor suite, and no specific reason for it.

yashmewada9618 avatar Mar 09 '24 19:03 yashmewada9618

I tried using the ROS_LOCALHOST_ONLY but it didn't work in my case, a more detailed problem can be found https://github.com/ouster-lidar/ouster-ros/issues/278.

if it does not work only in localhost, that is the problem...

https://github.com/ouster-lidar/ouster-ros/issues/278 does not say any details either, just said didn't work in our case.

are you saying even with ROS_LOCALHOST_ONLY is enabled, endpoints can communicate each other beyond the network?

fujitatomoya avatar Mar 11 '24 16:03 fujitatomoya