rclpy
rclpy copied to clipboard
draft: avoid race condition in action client
work in progress PR to address https://github.com/ros2/rclpy/issues/1123
Thank you for the contribution.
First, can you please retarget this to rolling? We'll review it there first, then we can consider backporting it to humble.
Second, can you please rebase your commits and add in a Signed-off-by line? The DCO bot has to pass for us to accept the commits.
Finally, can you please add some context to the initial description saying exactly what this is fixing? That would help us review it.
HI @clalancette - I'm just following up to say this PR is related to the bug described in https://github.com/ros2/rclpy/issues/1123, which I posted yesterday.
To summarize my bug report here: There are race conditions in the "async" functions on the ActionClient for sending goals, cancelling goals, and getting results. The core of the problem is that the underling _client_handle can sometimes finish its request before the sequence_number has been tracked in the _goal_sequence_number_to_goal_id (or any number of other dicts that are initialized for the purpose of tracking ongoing requests of different types).
The symptom ends up being that you have a request that never completes (blocking forever), and you see the following log in console:
Ignoring unexpected goal response. There may be more than one action server for the action '{ACTION_NAME}'
This PR seems to try out a solution where all of those dictionaries for tracking requests on the python side are protected by a lock. I think there might be multiple ways to tackle this.
Apologies for not linking the issue, I am currently still iterating on the fix and the PR is not ready yet. I have it directed to humble to test easily with my setup and planning to redirect to rolling if everything works and fix aligns with recommendation of the repo authors.
You might consider doing a squash merge of your development efforts, and then basing your PR off of the single commit.
You might consider doing a squash merge of your development efforts, and then basing your PR off of the single commit.
i do not think this is needed since that is own development branch, and we can do so when we take the PR to mainline.
Any thoughts on this proposal as a resolution? https://github.com/ros2/rclpy/issues/1123#issuecomment-1551836772
If anyone whose familiar with the ActionClient has opinions on their preferred resolution method, I'm happy to make a PR based off of them! This race condition affects humble up to current.
https://github.com/ros2/rclpy/pull/1308 replaces this PR.