ros2_documentation icon indicating copy to clipboard operation
ros2_documentation copied to clipboard

Python Message Container Type Misinformation

Open InvincibleRMC opened this issue 5 months ago • 14 comments

Issue Type

  • [x] 🐛 Bug / Problem
  • [ ] ✏️ Typo / Grammar
  • [ ] 📖 Outdated Content
  • [ ] 🚀 Enhancement

Generated by Generative AI

No response

Distribution

All

Description

The documentation here claims that the container type for arrays/unbounded/bounded is builtins.list. But at run time it can be various other container types. E.j Sequence or set. This can matter for users since they cannot use list specific methods type safely e.j. append.

Affected Pages/Sections

No response

Screenshots or Examples (if applicable)

Image

Suggested Fix

Add a note about what container actually exist for each type.

Additional Context

No response

InvincibleRMC avatar Aug 14 '25 22:08 InvincibleRMC

But at run time it can be various other container types. E.j Sequence or set

Can you point to the corresponding code just to confirm this? Then a PR to clarify/fix the docs would be appreciated!

christophebedard avatar Aug 21 '25 17:08 christophebedard

Can you point to the corresponding code just to confirm this? Then a PR to clarify/fix the docs would be appreciated!

Just one example. Its honesly easiest to see by building test_interface_files/Arrays.msg

https://github.com/ros2/rosidl_python/blob/e4c419f880e266533297573931f4576bc368ae13/rosidl_generator_py/resource/_msg.py.em#L543-L547

InvincibleRMC avatar Aug 21 '25 18:08 InvincibleRMC

Interesting:

>>> import os
>>> os.environ['ROS_PYTHON_CHECK_FIELDS'] = '1'
>>> from std_msgs.msg import ByteMultiArray
>>> msg = ByteMultiArray()
>>> msg.data
[]
>>> msg.data = 'abc'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/christophe.bedard/ros2_ws/install/std_msgs/lib/python3.12/site-packages/std_msgs/msg/_byte_multi_array.py", line 182, in data
    ((isinstance(value, Sequence) or
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: The 'data' field must be a set or sequence and each value of type 'bytes'
>>> msg.data = [1, 2, 3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/christophe.bedard/ros2_ws/install/std_msgs/lib/python3.12/site-packages/std_msgs/msg/_byte_multi_array.py", line 182, in data
    ((isinstance(value, Sequence) or
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: The 'data' field must be a set or sequence and each value of type 'bytes'
>>> msg.data = [b'a', b'b', b'c']
>>> msg.data
[b'a', b'b', b'c']
>>> msg.data = {b'a', b'b', b'c'}
>>> msg.data
{b'a', b'b', b'c'}
>>> msg.data.append(b'd')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'set' object has no attribute 'append'
>>> msg.data = [b'a', b'b', b'c']
>>> msg.data.append(b'd')
>>> msg.data
[b'a', b'b', b'c', b'd']

Yeah, it would be good to mention this!

christophebedard avatar Aug 22 '25 16:08 christophebedard

@christophebedard How does one actually test the python type_support? I want make sure sets actually work when turned from Python into C.

InvincibleRMC avatar Oct 15 '25 20:10 InvincibleRMC

good question 🤔 have you tried just creating a pub/sub pair with a message type with a field that can map to a set?

As for making sure this is covered by the tests, rosidl_generator_py's tests would be ideal (e.g., test_interfaces.py), but it's Python only. Python messages only get converted to C on publication and converted from C after they get taken.

Depending on the type, it might be covered by the rclpy tests or the tests in test_communication. I think test_communication runs pub/sub tests using all test messages (e.g., test_interface_files and test_msgs' own msgs). Looks like it properly checks that the received message is equal to the test message that the publisher retrieved from test_msgs: https://github.com/ros2/system_tests/blob/976a7a6021fa629ad97741969101685af386fe01/test_communication/test/subscriber_py.py#L28-L42.

In conclusion, check that something here uses Python sets, otherwise add it (and make necessary downstream changes in the actual tests): https://github.com/ros2/rcl_interfaces/blob/rolling/test_msgs/src/test_msgs/message_fixtures.py

christophebedard avatar Oct 15 '25 23:10 christophebedard

So there isn't any set() being used in any tests but, more concerning is that set order is not deterministic.

# Sometimes works, sometimes crashes
b = ["", "max value", "min value"]
a = set(b)

for i, j in zip(a, b, strict=True):
    print(f'list: {i} set: {j}')
    assert i == j

So sometimes python message will be equal each other but, sometimes not. I find it extremely hard to believe anyone is actually using sets. When I added it to system_tests the amount of failures changes on every run.

InvincibleRMC avatar Oct 16 '25 02:10 InvincibleRMC

I think there should probably be a discussion about the continued support of set. Either we need to implement/import some OrderedSet implementation to prevent data re-ordering. Or we need to deprecate/remove. Unfortunately the isinstance(value, Set) check has been around 8 years so someone could theoretical be using it even though it hasn't been working or listed in the documentation.

InvincibleRMC avatar Oct 16 '25 20:10 InvincibleRMC

sets are inherently not ordered, so that shouldn't matter when comparing them, no?

>>> set(['a', 'b', 'c']) == set(['b', 'a', 'c'])
True

christophebedard avatar Oct 16 '25 21:10 christophebedard

Yeah when comparing sets its fine but, when the message gets converted into C and back out again it uses PySequence_Fast. Which works by first checking if the incoming object is a list or tuple then returns them else if attempt a list cast of objects iterator. PySequence_Fast Implementation.

Since the incoming object is a set the calls look like. Which is non-deterministic since the iter method of sets in non-deterministic.

print(list(iter({'a', 'b', 'c'})))

Reference of one array field from _arrays_s.c

{  // bool_values
    PyObject * field = PyObject_GetAttrString(_pymsg, "bool_values");
    if (!field) {
      return false;
    }
    {
      PyObject * seq_field = PySequence_Fast(field, "expected a sequence in 'bool_values'");
      if (!seq_field) {
        Py_DECREF(field);
        return false;
      }
      Py_ssize_t size = 3;
      bool * dest = ros_message->bool_values;
      for (Py_ssize_t i = 0; i < size; ++i) {
        PyObject * item = PySequence_Fast_GET_ITEM(seq_field, i);
        if (!item) {
          Py_DECREF(seq_field);
          Py_DECREF(field);
          return false;
        }
        assert(PyBool_Check(item));
        bool tmp = (item == Py_True);
        memcpy(&dest[i], &tmp, sizeof(bool));
      }
      Py_DECREF(seq_field);
    }
    Py_DECREF(field);
  }

InvincibleRMC avatar Oct 16 '25 22:10 InvincibleRMC

Thanks for the explanation. To summarize, there are two things here:

  1. Ordering issue. By default, array fields are lists in Python, so users can call append() on them. Users can assign a set() to those fields, but the field is still supposed to be a "list" in the general sense (just look at the array types for C++), which is ordered. Therefore, I think this is "expected." However, we should clearly document this.
  2. Unclear type of "list" fields at runtime. Like you said, the real issue is that you can pass a message around (e.g., within a node) with a "list" field that's actually a set(), which might lead to an error at runtime (e.g., if you try to append() to a set())
    • We should at least document this too.
    • A proper fix is unclear. We probably shouldn't let users assign anything other than ordered containers (list, tuple). Or we can let them assign non-ordered containers, but convert them to lists (with a non-deterministic order) to avoid any runtime issues.

christophebedard avatar Oct 22 '25 18:10 christophebedard

As for where to document this, the section you point to seems to be the most logical place: https://docs.ros.org/en/rolling/Concepts/Basic/About-Interfaces.html#field-types

christophebedard avatar Oct 22 '25 18:10 christophebedard

   * We should at least document this too.
   * A proper fix is unclear. We probably shouldn't let users assign anything other than ordered containers (list, tuple). Or we can let them assign non-ordered containers, but convert them to lists (with a non-deterministic order) to avoid any runtime issues.

I definitely agree we should update the documentation here. For a proper fix I think we should either choose some standard container types like list or tuple (since these are more performant when using PySequence_Fast for example) or at the very least restrict it to the collections.abc.Sequence to continue support in case users wanted to keep the ability to pass in either custom classes or numpy arrays for examples. I think through static typing/runtime asserts we should not allow non-ordered containers at all. So if users want to use set they can do the cast themselves when passing into a message.

This discussion should maybe be moved elsewhere like rosidl_python since it has moved from the documentation itself and discussing implementation.

InvincibleRMC avatar Oct 22 '25 18:10 InvincibleRMC

I think through static typing/runtime asserts we should not allow non-ordered containers at all. So if users want to use set they can do the cast themselves when passing into a message.

I agree. We could start with a simple warning in Rolling for Lyrical and then make it an error in M-turtle.

This discussion should maybe be moved elsewhere like rosidl_python since it has moved from the documentation itself and discussing implementation.

Yep. We can keep this issue for the documentation changes. You can open a PR or issue against ros2/rosidl_python for the proposed code changes, and we can discuss that there (and potentially in a PMC meeting after ROSCon).

christophebedard avatar Oct 22 '25 18:10 christophebedard

While doing some other investigation I did finally found where it was documented lol. https://design.ros2.org/articles/idl_interface_definition.html

InvincibleRMC avatar Oct 24 '25 02:10 InvincibleRMC