rclpy icon indicating copy to clipboard operation
rclpy copied to clipboard

draft: avoid race condition in action client

Open aditya2592 opened this issue 2 years ago • 6 comments
trafficstars

work in progress PR to address https://github.com/ros2/rclpy/issues/1123

aditya2592 avatar May 04 '23 02:05 aditya2592

Thank you for the contribution.

First, can you please retarget this to rolling? We'll review it there first, then we can consider backporting it to humble.

Second, can you please rebase your commits and add in a Signed-off-by line? The DCO bot has to pass for us to accept the commits.

Finally, can you please add some context to the initial description saying exactly what this is fixing? That would help us review it.

clalancette avatar May 04 '23 14:05 clalancette

HI @clalancette - I'm just following up to say this PR is related to the bug described in https://github.com/ros2/rclpy/issues/1123, which I posted yesterday.

To summarize my bug report here: There are race conditions in the "async" functions on the ActionClient for sending goals, cancelling goals, and getting results. The core of the problem is that the underling _client_handle can sometimes finish its request before the sequence_number has been tracked in the _goal_sequence_number_to_goal_id (or any number of other dicts that are initialized for the purpose of tracking ongoing requests of different types).

The symptom ends up being that you have a request that never completes (blocking forever), and you see the following log in console:

Ignoring unexpected goal response. There may be more than one action server for the action '{ACTION_NAME}'

This PR seems to try out a solution where all of those dictionaries for tracking requests on the python side are protected by a lock. I think there might be multiple ways to tackle this.

apockill avatar May 04 '23 21:05 apockill

Apologies for not linking the issue, I am currently still iterating on the fix and the PR is not ready yet. I have it directed to humble to test easily with my setup and planning to redirect to rolling if everything works and fix aligns with recommendation of the repo authors.

aditya2592 avatar May 11 '23 00:05 aditya2592

You might consider doing a squash merge of your development efforts, and then basing your PR off of the single commit.

dcconner avatar May 17 '23 21:05 dcconner

You might consider doing a squash merge of your development efforts, and then basing your PR off of the single commit.

i do not think this is needed since that is own development branch, and we can do so when we take the PR to mainline.

fujitatomoya avatar May 17 '23 22:05 fujitatomoya

Any thoughts on this proposal as a resolution? https://github.com/ros2/rclpy/issues/1123#issuecomment-1551836772

If anyone whose familiar with the ActionClient has opinions on their preferred resolution method, I'm happy to make a PR based off of them! This race condition affects humble up to current.

apockill avatar May 26 '23 16:05 apockill

https://github.com/ros2/rclpy/pull/1308 replaces this PR.

fujitatomoya avatar Jun 28 '24 19:06 fujitatomoya