rclpy icon indicating copy to clipboard operation
rclpy copied to clipboard

Race condition in Action Client service calls

Open jacobperron opened this issue 3 years ago • 1 comments

Like https://github.com/ros2/rclpy/issues/870, there is a similar race with action clients.

See discussion here: https://github.com/ros2/rclpy/pull/871#issuecomment-1010322339

jacobperron avatar Jan 14 '22 04:01 jacobperron

This is one is tricky. Technically, all three functions send_goal_async, _cancel_goal_async, and _get_result_async can cause a race condition. To address this, we would need to acquire a lock from self._client_handle.send_*_request until self.add_future(future).

Then we would need to acquire the same lock at for future in waitable._futures. This can be done by implementing a function waitable.get_futures which acquires the lock before returning the futures.

This is the first difficulty. Logically, this function should be implemented by the Waitable class which already implements the methods add_future and remove_future, and also initializes the _futures list. However, the lock is required by the subclass ActionClient which needs to acquire it for the aforementioned service calls. Two solutions:

  1. Waitable creates the lock and acquires it for get_futures and ActionClient also acquires it for the service calls.
  2. ActionClient implements get_futures.

This should avoid any race condition given the current implementation. However, this may not solve it in general. This solution would ensure that https://github.com/ros2/rclpy/blob/0bafc8ae187682ce8c62fe83da8be76badbf4abc/rclpy/rclpy/executors.py#L382 cannot run before all futures have been added to the _futures list. By extension, it also ensures that https://github.com/ros2/rclpy/blob/0bafc8ae187682ce8c62fe83da8be76badbf4abc/rclpy/rclpy/executors.py#L384 cannot run before all futures have been added to the _pending_*_requests dictionary. However, if the waitable.get_futures method is for some reason not called before waitable.execute then it could lead to a race condition and ultimately a KeyError.

augustelalande avatar Jan 15 '22 17:01 augustelalande