Sequences and Arrays of Nested Structs Cause Segfault on Pub-Sub for Dynamic Types (with minimal examples)

Open methylDragon opened this issue 2 years ago • 1 comments

⚠️ This is a blocker for demoing a ROS 2 REP reference implementation including FastDDS for ROSCon 2022 ⚠️

We just need a fix in-source (no need for a release yet) to do the demo. The trouble is, I think the issue is kinda inside the guts of FastDDS...

Is there an already existing issue for this?

  • [X] I have searched the existing issues

Related issues exist for earlier versions of FastDDS (2.3.X), and the problem seems to persist in the current version.

Expected behavior

  1. Create a dynamic type consisting of a sequence/array of nested structs
  2. Pass it to a TypeSupport
  3. Register the TypeSupport (registration succeeds without crashing)

Current behavior

  1. Create a dynamic type consisting of a sequence/array of nested structs
  2. Pass it to a TypeSupport
  3. Register the TypeSupport <--- This segfaults

This issue occurs for empty sequences, non-empty sequences, and arrays of nested structs.
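
For readers who do not want to open the linked repository, the three steps can be sketched roughly as follows. This is not the code from the minimal examples; it is an illustration that assumes the Fast DDS 2.x dynamic types API from the `fastrtps/types` headers. The type and member names (`inner`, `outer`, `nested_sequence`), the sequence bound of 5, and domain ID 0 are all made up for the sketch. It needs to be built and linked against Fast DDS.

```cpp
// Sketch only: assumes the Fast DDS 2.x dynamic types API.
// All type/member names, the bound, and the domain ID are hypothetical.
#include <fastdds/dds/domain/DomainParticipant.hpp>
#include <fastdds/dds/domain/DomainParticipantFactory.hpp>
#include <fastdds/dds/topic/TypeSupport.hpp>
#include <fastrtps/types/DynamicPubSubType.h>
#include <fastrtps/types/DynamicTypeBuilder.h>
#include <fastrtps/types/DynamicTypeBuilderFactory.h>
#include <fastrtps/types/DynamicTypeBuilderPtr.h>
#include <fastrtps/types/DynamicTypePtr.h>

#include <iostream>

namespace dds = eprosima::fastdds::dds;
namespace types = eprosima::fastrtps::types;

int main()
{
    types::DynamicTypeBuilderFactory* factory =
            types::DynamicTypeBuilderFactory::get_instance();

    // Inner struct with a single uint32 member.
    types::DynamicTypeBuilder_ptr inner(factory->create_struct_builder());
    inner->set_name("inner");
    inner->add_member(0, "inner_uint32", factory->create_uint32_type());

    // Step 1: outer struct whose only member is a sequence of the inner struct.
    types::DynamicTypeBuilder_ptr sequence(
        factory->create_sequence_builder(inner.get(), 5));
    types::DynamicTypeBuilder_ptr outer(factory->create_struct_builder());
    outer->set_name("outer");
    outer->add_member(0, "nested_sequence", sequence.get());
    types::DynamicType_ptr outer_type = outer->build();

    // Step 2: pass the dynamic type to a TypeSupport.
    dds::TypeSupport type_support(new types::DynamicPubSubType(outer_type));

    dds::DomainParticipant* participant =
            dds::DomainParticipantFactory::get_instance()->create_participant(
        0, dds::PARTICIPANT_QOS_DEFAULT);
    if (participant == nullptr)
    {
        std::cerr << "Could not create participant" << std::endl;
        return 1;
    }

    // Step 3: register the TypeSupport. Per this report, the crash is
    // observed at this step on the affected commit.
    auto ret = type_support.register_type(participant);
    const bool ok = (ret == types::ReturnCode_t::RETCODE_OK);
    std::cout << "register_type " << (ok ? "succeeded" : "failed") << std::endl;

    dds::DomainParticipantFactory::get_instance()->delete_participant(participant);
    return ok ? 0 : 1;
}
```

For the array case, `create_sequence_builder` would presumably be swapped for `create_array_builder` with a vector of bounds. The examples in the linked repository remain the authoritative reproduction.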

Steps to reproduce

I've prepared a minimal example: https://github.com/methylDragon/fastdds-pr-minimal-examples

Once built, run:

cd build

./src/nested-sequences-bug/empty_sequence_of_nested_structs
./src/nested-sequences-bug/sequence_of_nested_structs
./src/nested-sequences-bug/array_of_nested_structs

The relevant code can be found here.

The types can be constructed, dynamic data can be created, and the dynamic data is printable. Only the registration of the type support breaks, for reasons I have not identified.

Fast DDS version/commit

FastDDS 2.8.x: https://github.com/eProsima/Fast-DDS/commit/78473a0de3d878a71565ee5d6c8e1e01c34beb70

The examples workspace is using that specific SHA in the submodule, for convenience.

Platform/Architecture

Other. Please specify in Additional context section.

Transport layer

Default configuration, UDPv4 & SHM

Additional context

This bug is hampering progress on the dynamic types and type introspection reference implementation for ROS 2 as proposed in REP 2011, to be presented at ROSCon 2022. Fixing it will help us complete the demo before the talk.

Without a fix, the vast majority of ROS 2 use cases (sequences of nested message types) will not be possible, so this is quite critical.

This is on 22.04, but I don't think the platform matters in this case.

I've done some tracing: the segfault occurs at both of these lines: here and here.

I don't know why it occurs, though I suspect it's similar to these issues:

  • https://github.com/eProsima/Fast-DDS/issues/1243
  • https://github.com/eProsima/Fast-DDS/issues/1260
  • https://github.com/eProsima/Fast-DDS/issues/2184

XML configuration file

No response

Relevant log output

nested_array: 
	[0] = <struct/bitset>
		inner_uint32: 0

	[1] = <struct/bitset>
		inner_uint32: 0

[OK BEFORE REGISTERING TYPE]
Segmentation fault (core dumped)


Network traffic capture

No response

methylDragon • Sep 22 '22 22:09

Just gonna ping @jparisu since they were engaging with a closely related problem in the listed issues

methylDragon • Sep 22 '22 23:09

I have opened a new issue that explains in detail the cause of the crash and its different manifestations, so that the bug is tracked in a single place. I am closing this issue in favor of #3296.

JLBuenoLopez • Feb 14 '23 09:02