[DATASHARING_PAYLOADPOOL Error] Failed to create segment fast_datasharing_

Open shan-weiqiang opened this issue 2 years ago • 7 comments

Is there an already existing issue for this?

  • [X] I have searched the existing issues

Expected behavior

datawriter pool should be initialized normally

Current behavior

2022-07-11 14:33:48.281 [DATASHARING_PAYLOADPOOL Error] Failed to create segment fast_datasharing_01.0f.00.4b.1a.23.11.28.01.00.00.00_0.0.1f.3: Segment size is too large: 9605520720 (max is 4294967295). Please reduce the maximum size of the history -> Function init_shared_segment
2022-07-11 14:33:48.281 [RTPS_WRITER Error] Could not initialize DataSharing writer pool -> Function init

[Image: capture]

Steps to reproduce

We have a participant with 30+ DataWriters. The problem happens when all the ports are initialized; if we delete half of the ports, it does not happen. It's hard to provide all the ports and data types we use, but the number of writers affects the chances of hitting this problem.

Fast DDS version/commit

latest

Platform/Architecture

Other. Please specify in Additional context section.

Transport layer

Shared Memory Transport (SHM)

Additional context

No response

XML configuration file

No response

Relevant log output

No response

Network traffic capture

No response

shan-weiqiang avatar Jul 11 '22 06:07 shan-weiqiang

platform is x86-64 Ubuntu 18.04

shan-weiqiang avatar Jul 11 '22 06:07 shan-weiqiang

The reason is a type overflow; see https://github.com/eProsima/Fast-DDS/blob/a000d613604d53481b775a193f176f6ef742dd2a/src/cpp/rtps/DataSharing/WriterPool.hpp#L171.
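
The limit in the log can be checked by hand. Below is a minimal Python sketch that assumes only what the error message states: a requested size of 9605520720 bytes against a maximum of 4294967295, the largest 32-bit unsigned value. The helper names are hypothetical and are not part of Fast DDS.

```python
UINT32_MAX = 2**32 - 1  # 4294967295, the "max" printed in the error message


def fits_in_uint32(size_bytes: int) -> bool:
    """True if the size can be stored in a 32-bit unsigned integer."""
    return 0 <= size_bytes <= UINT32_MAX


def wrapped_uint32(size_bytes: int) -> int:
    """Value a 32-bit unsigned field would hold if the size were truncated."""
    return size_bytes & UINT32_MAX


requested = 9_605_520_720  # segment size reported in the log

print(fits_in_uint32(requested))   # False
print(wrapped_uint32(requested))   # 1015586128
```

The second value shows why silently storing the size in a 32-bit field would be dangerous: the segment would end up far smaller than the pool needs, so refusing to create it is the safer outcome.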

llapx avatar Jul 11 '22 07:07 llapx

I'm having a very similar problem. I'm using the osrf/ros:humble-desktop docker image and trying to use shared memory communication between two ROS 2 nodes in separate processes in the same machine.

I have the following XML profile

<?xml version="1.0" encoding="UTF-8" ?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">

   <data_writer profile_name="default publisher profile" is_default_profile="true">
       <qos>
           <publishMode>
               <kind>ASYNCHRONOUS</kind>
           </publishMode>
           <data_sharing>
               <kind>AUTOMATIC</kind>
           </data_sharing>
       </qos>
       <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
   </data_writer>

   <data_reader profile_name="default subscription profile" is_default_profile="true">
       <qos>
           <data_sharing>
               <kind>AUTOMATIC</kind>
           </data_sharing>
       </qos>
       <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
   </data_reader>

</profiles>

and I'm setting export RMW_FASTRTPS_USE_QOS_FROM_XML=1

I'm using an 8Mb message.

  1. Trying my system using the version of ros-humble-rmw-fastrtps-cpp/now 6.2.1-2jammy.20220520.012804 I had the following logs resulting in a crash of the applications.
2022-07-16 16:34:24.007 [DATASHARING_PAYLOADPOOL Error] Failed to create segment fast_datasharing_01.0f.70.b7.44.14.70.95.01.00.00.00_0.0.12.3: No such file or directory -> Function init_shared_segment
2022-07-16 16:34:24.007 [RTPS_WRITER Error] Could not initialize DataSharing writer pool -> Function init
2022-07-16 16:34:24.007 [RTPS_PARTICIPANT Error] A writer with the same entityId already exists in this RTPSParticipant -> Function create_writer
2022-07-16 16:34:24.007 [DATA_WRITER Error] Problem creating associated Writer -> Function enable
terminate called after throwing an instance of 'rclcpp::exceptions::RCLError'
  what():  could not create publisher: create_publisher() could not create data writer, at ./src/publisher.cpp:279, at ./src/rcl/publisher.c:116
  2. Updating libraries to the latest ros-humble-rmw-fastrtps-cpp/jammy,now 6.2.1-2jammy.20220620.175641 the situation improves: the application does not crash but I get the following warning, and the system does not seem to be using shared memory
2022-07-16 16:24:57.720 [DATASHARING_PAYLOADPOOL Error] Failed to create segment fast_datasharing_01.0f.eb.7d.31.f4.32.1a.01.00.00.00_0.0.12.3: No such file or directory -> Function init_shared_segment
2022-07-16 16:24:57.720 [RTPS_WRITER Error] Could not initialize DataSharing writer pool -> Function init
  3. If I change the message size to 1Mb, the above example with 1 pub and 1 sub works as expected, but on the other hand if I have 5 subscriptions (all in different processes) I get the following
2022-07-17 17:09:30.902 [DATASHARING_PAYLOADPOOL Error] Failed to create segment fast_datasharing_01.0f.eb.7d.66.b4.9b.d0.01.00.00.00_0.0.12.3: No such file or directory -> Function init_shared_segment
2022-07-17 17:09:30.902 [RTPS_WRITER Error] Could not initialize DataSharing writer pool -> Function init

Is it the same issue or a different problem? How to allow for a larger message size?

alsora avatar Jul 16 '22 16:07 alsora

@alsora

Is it the same issue or a different problem?

I think it's the same problem with shared memory.

I have tested with ros:humble and ros:rolling, and both show the same problem. The amount of shared memory you give the Docker container limits how many publishers/subscribers you can create. Here is my Docker launch script:

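#!/bin/bash
# $1: container name, $2: image to run
NAME=$1
IMG=$2
NET=host

# --shm-size sets the size of /dev/shm inside the container
docker run \
  -it \
  -d \
  --privileged \
  --shm-size=4G \
  --network "$NET" \
  --name "$NAME" \
  -v "${PWD}:/work" \
  "${IMG}" \
  /bin/bash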

How to allow for a larger message size?

Increase the container's --shm-size.
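
One way to see how much room a container actually has is to look at the free space under /dev/shm, which is what Docker's --shm-size controls. Below is a minimal Python sketch. The helper names and the 10 x 8 MiB example are hypothetical, and it assumes the data-sharing segments live under /dev/shm. The real pool also needs bookkeeping space on top of the payloads, so treat the result as a lower bound.

```python
import os
import shutil


def free_bytes(path: str = "/dev/shm") -> int:
    """Free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def payloads_fit(message_bytes: int, samples: int, path: str = "/dev/shm") -> bool:
    """True if `samples` payloads of `message_bytes` each fit in the free space."""
    return message_bytes * samples <= free_bytes(path)


# Only query /dev/shm where it exists (Linux hosts and containers).
if os.path.isdir("/dev/shm"):
    print("free bytes in /dev/shm:", free_bytes())
    print("10 x 8 MiB payloads fit:", payloads_fit(8 * 1024 * 1024, 10))
```

Running this inside the container before and after changing --shm-size should show whether the new size took effect.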

llapx avatar Jul 18 '22 07:07 llapx

First of all, sorry for the late answer.

Thank you @llapx for taking the time to answer this!

This issue has several questions which I will answer in separate comments.

MiguelCompany avatar Oct 11 '22 10:10 MiguelCompany

@shan-weiqiang The error you put in the image should always happen, independently of the writers or readers created. That is, it should always fail for the same data type and DataWriterQos.

If the data-sharing kind has been set to automatic() (which is the default), the creation will continue using a standard payload pool. If it has been set to on() the creation of the DataWriter will fail.

I suppose adding the topic name of the DataWriter being created to the error message would help to debug such situations.

In summary, to avoid the error message you are seeing, try adjusting the values of ResourceLimitsQosPolicy, reducing the max_samples value.
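
To get a feel for how far max_samples has to drop, the number in the log can be worked backwards. 9605520720 equals 5001 × 1920720, which would be consistent with the default max_samples of 5000 plus one extra slot and about 1.9 MB per sample. That reading is an assumption, not something confirmed in this thread. Below is a rough Python sketch under that assumption. The helper name is hypothetical, and the real per-sample size depends on the data type and on internal bookkeeping.

```python
UINT32_MAX = 2**32 - 1  # largest segment size accepted, per the error message


def max_slots(per_sample_bytes: int, limit: int = UINT32_MAX) -> int:
    """Largest number of equally sized slots whose total stays within `limit`."""
    return limit // per_sample_bytes


reported = 9_605_520_720   # segment size from the log
per_sample = 1_920_720     # assumed slot size: reported / 5001

# Sanity check of the assumed factorization.
assert 5001 * per_sample == reported

print(max_slots(per_sample))  # 2236
```

Under that assumption, a max_samples value somewhat below 2236 would keep the segment under the limit for this data type.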

MiguelCompany avatar Oct 11 '22 12:10 MiguelCompany

@alsora

  1. Trying my system using the version of ros-humble-rmw-fastrtps-cpp/now 6.2.1-2jammy.20220520.012804 I had the following logs resulting in a crash of the applications.

As you already checked, this was fixed in the rmw, and updating the package stopped the crash.

  2. Updating libraries to the latest ros-humble-rmw-fastrtps-cpp/jammy,now 6.2.1-2jammy.20220620.175641 the situation improves: the application does not crash but I get the following warning, and the system does not seem to be using shared memory

Correct. As I just explained in my previous comment, setting data-sharing to AUTOMATIC means data-sharing is not used if the pool cannot be created.

  3. If I change the message size to 1Mb, the above example with 1 pub and 1 sub works as expected, but on the other hand if I have 5 subscriptions (all in different processes) I get the following

Adding several subscription applications may require additional data-sharing resources, due to the DDS topics automatically created by RCL / RMW. You could try changing your XML profile so that it only applies to the topic on which data-sharing should be used. You can find more information on how to do this here

How to allow for a larger message size?

Apart from the recommendation from @llapx, you can reduce the depth of your history, if that works for your application. Adjusting the max_samples value in ResourceLimitsQosPolicy would also be necessary for KEEP_ALL writers.

MiguelCompany avatar Oct 11 '22 12:10 MiguelCompany

According to our CONTRIBUTING.md guidelines, I am closing this issue due to inactivity. Please, feel free to reopen it if necessary.

Mario-DL avatar Mar 24 '23 08:03 Mario-DL