rclcpp icon indicating copy to clipboard operation
rclcpp copied to clipboard

ROS 2 services are not 1 to 1 communication: responses are copied and sent to all clients

Open mauropasse opened this issue 1 year ago • 1 comments
trafficstars

Issue report

Currently if there are 2 (or more) clients with the same /service_name, both will get the server response (one will silently discard it after noticing it wasn't for him).

This has CPU & memory performances implications - in many cases server responses can be quite big - thus unnecessary copies and de-serialization of responses happen, while there could be multiple clients with same name (not uncommon situation).

I verified this problem with rmw_cyclonedds & rmw_fastrtps , on current rolling, humble & galactic.

  // Example
  auto service = node1->create_service<SetBool>("srv_name");
  auto client_1 = node2->create_client<SetBool>("srv_name");
  auto client_2 = node3->create_client<SetBool>("srv_name");

  client_1->async_send_request(...);
  
  // Spin service, so it will get the request and produce a response.
  // The response is received in both clients - one will silently discard it

A full example can be found here, where I'm also using a service message with an 8MB request / 8MB response to better appreciate the performance consequences.

With the mentioned example using the mentioned service, I can compute a CPU FlameGraph showing the difference between having a single client vs having 2 clients with same /service_name, where we can see the extra CPU time overhead:

CPU_FlameGraph_srv_cli_double_delivery

In a one-to-one client-server communication model, a client sends a request to the server, and the server should respond to that specific client.

It'd be nice to find a way to comply with this model. Maybe the Service can store the Client's endpoint, and repond only to it.

Note that this issue doesn't happen if we use intra-process communication between clients & services, where the service only responds to the client who made the request.

The issue was noticed thanks to a log I added in the past, which would be nice to include in rolling:

[ERROR] [1699640383.717414288] [rclcpp]: Error in take_type_erased_response: RCL_RET_CLIENT_TAKE_FAILED. Service name: /persistent_params/list_parameters

mauropasse avatar Dec 22 '23 14:12 mauropasse

@mauropasse @alsora

this performance? or efficiency issue has known for a long time, thanks! this brings back memory to me.

similar things are described in https://github.com/ros2/design/blob/918c09758ed4c0854aa128b9c8ed0051c21a6590/articles/content_filtering.md#problems for development for Content Filtering Subscription (https://github.com/ros2/design/pull/282, :sweat: almost 4 years ago...)

originally, we were going to enhance the ROS 2 system /parameter_events action and service with this new feature, but it stopped on https://github.com/ros2/rcl/issues/982. probably we can reconsider how we want to proceed for this issue, including https://github.com/ros2/rcl/issues/982 is required or not. (Note, https://github.com/ros2/rcl/issues/982 does not provide any performance improvement nor efficiency, it is just concealment to provide compatible user application experience, which is also good thing.)

I think your problem here can be solved above approach, but if your requirement is really 1 to 1 server:client architecture, that is NOT something Content Filtering Subscription provides.

fujitatomoya avatar Jan 04 '24 22:01 fujitatomoya