ros2_controllers [JTC] Fix race condition & interpolation bug

Open mechwiz opened this issue 2 years ago • 0 comments

This PR addresses a few bugs:

Race condition - Ideally, when a new goal comes in from the action server, the same update cycle should see both a new new_external_msg when reading from traj_msg_external_point_ptr_.readFromRT(); and an active goal when reading from *rt_active_goal_.readFromRT();. However, in a project I'm working on I noticed that after a new goal comes in, traj_msg_external_point_ptr_.readFromRT(); sometimes still returns the old trajectory for one cycle (most likely because the new one is still being written by the non-RT thread). Because *rt_active_goal_.readFromRT(); is called later within the same update cycle, it already reports an active goal, so the JTC returns success immediately: the trajectory still cached within the JTC has indeed been completed (now for a second time).

See the log output below. Note that I added the "new message received" log to mark when new_external_msg within the update callback had actually changed. That log should always occur between the logs for "Goal request accepted!" and "Goal reached, success!", but here you can see it occurred after both. Another sanity check is that the time difference between "Goal request accepted!" and "Goal reached, success!" is tiny (about 0.7 ms in the log below), implying that the goal returned success immediately.

[ros2_control_node-9] [INFO] [1656865754.199093651] [RightArm.position_traj_controller]: Received new action goal
[ros2_control_node-9] [INFO] [1656865754.199622192] [RightArm.position_traj_controller]: Accepted new action goal
[move_group-1] [INFO] [1656865754.199809730] [moveit.simple_controller_manager.follow_joint_trajectory_controller_handle]: /RightArm/position_traj_controller started execution
[move_group-1] [INFO] [1656865754.199831511] [moveit.simple_controller_manager.follow_joint_trajectory_controller_handle]: Goal request accepted!
[ros2_control_node-9] [INFO] [1656865754.200516294] [RightArm.position_traj_controller]: Goal reached, success!
[ros2_control_node-9] [WARN] [1656865754.203480867] [RightArm.position_traj_controller]: new message received
[move_group-1] [INFO] [1656865754.204146901] [moveit.simple_controller_manager.follow_joint_trajectory_controller_handle]: Controller '/RightArm/position_traj_controller' successfully finished
[move_group-1] [INFO] [1656865754.212873413] [moveit_ros.trajectory_execution_manager]: Completed trajectory execution with status SUCCEEDED ...
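
The ordering and timing argument can be checked directly from the stamps in the log above. The following is a small sketch, not part of the PR; `Stamp`, `to_ns` and `elapsed_ms` are hypothetical helper names. It uses integer nanoseconds because a `double` cannot hold these epoch stamps at nanosecond resolution.

```cpp
#include <cassert>
#include <cstdint>

// A log stamp split into whole seconds and nanoseconds, as printed in the
// log lines above. Hypothetical helper type, not a ROS message.
struct Stamp
{
  std::int64_t sec;
  std::int64_t nsec;
};

// Convert to integer nanoseconds so that differences are exact.
constexpr std::int64_t to_ns(Stamp s)
{
  return s.sec * 1000000000LL + s.nsec;
}

// Elapsed time from `from` to `to` in milliseconds.
constexpr double elapsed_ms(Stamp from, Stamp to)
{
  return static_cast<double>(to_ns(to) - to_ns(from)) / 1e6;
}

// Stamps copied from the log:
//   "Goal request accepted!"  1656865754.199831511
//   "Goal reached, success!"  1656865754.200516294
//   "new message received"    1656865754.203480867
constexpr Stamp kAccepted{1656865754, 199831511};
constexpr Stamp kReached{1656865754, 200516294};
constexpr Stamp kNewMessage{1656865754, 203480867};
```

With these values, `elapsed_ms(kAccepted, kReached)` is about 0.68 ms, and `kNewMessage` comes roughly 3 ms after `kReached`, which is the wrong order for a healthy cycle.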

This PR therefore moves the read of *rt_active_goal_.readFromRT(); to the same point in the update cycle as the read of traj_msg_external_point_ptr_.readFromRT();. I think this makes sense in general, since it's probably good practice to query all relevant RT values at the same point in time. With this change, I did not see the problem come up again when testing my project.

Because *rt_active_goal_.readFromRT(); is now queried earlier in the update callback than before, some of the JTC tests started failing. It turned out that after a new goal came in, new_external_msg had been updated but active_goal had not yet been (though it would have been if it were read later in the same cycle). To handle this, I added a check for a mismatch between the active goal status queried earlier in the code and the one queried later; if they differ, that cycle is simply skipped. This could probably be handled better: on a mismatch, check whether new_external_msg was also updated in that cycle and, if so, update the active_goal status and carry out the rest of the cycle.
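
The early/late mismatch check described above can be sketched roughly as follows. This is a simplified illustration, not the PR's actual diff: `Trajectory`, `GoalHandle`, `CycleSnapshot`, `CycleAction` and `check_goal_consistency` are all hypothetical names, and plain shared pointers stand in for the values the controller obtains through readFromRT().

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the trajectory message and the goal handle.
struct Trajectory
{
  int id;
};
struct GoalHandle
{
  int id;
};

// Everything the update cycle needs from the non-RT side, captured at the
// same point in the cycle (the idea behind moving the active-goal read).
struct CycleSnapshot
{
  std::shared_ptr<Trajectory> trajectory;
  std::shared_ptr<GoalHandle> active_goal;
};

enum class CycleAction
{
  Run,  // early and late reads agree: process the cycle normally
  Skip  // the goal changed mid-cycle: do nothing until the next cycle
};

// `late_goal` is what a second read of the active goal returns later in the
// same cycle. If it differs from the early read, the snapshot may pair a new
// goal with an old trajectory (or the reverse), so the cycle is skipped.
CycleAction check_goal_consistency(
  const CycleSnapshot & early, const std::shared_ptr<GoalHandle> & late_goal)
{
  return early.active_goal == late_goal ? CycleAction::Run : CycleAction::Skip;
}
```

Skipping costs one control period when a goal arrives mid-cycle, which matches the simple handling described above; the refinement suggested there would instead refresh the goal and continue.
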

Interpolation bug - This occurs when both of the following hold:

Open loop control (so the last_commanded_state_ is prepended when a new trajectory comes in)
Continuous joint on hardware that returns values between -PI and PI and doesn't wrap around (i.e. does not give feedback above PI or below -PI)

This can lead to cases where the commanded state is -PI but the feedback from the joint is PI due to encoder resolution. A new trajectory is usually planned from the feedback (so PI in this case). When the last commanded state is prepended to that trajectory, the JTC tries to interpolate from -PI to PI over a very short time, causing faults on the hardware. This happens because the interpolation does not currently take the shortest angle into account. This PR fixes this.
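
To illustrate what taking the shortest angle into account means, here is a minimal sketch. It assumes plain linear interpolation between two positions of a continuous joint, which is a simplification and not the controller's actual interpolation code; `normalize_angle`, `shortest_angular_distance` and `interpolate_continuous` are hypothetical helpers written for this example.

```cpp
#include <cassert>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Wrap an angle in radians into [-pi, pi).
double normalize_angle(double angle)
{
  const double two_pi = 2.0 * kPi;
  double wrapped = std::fmod(angle + kPi, two_pi);
  if (wrapped < 0.0) {
    wrapped += two_pi;  // fmod keeps the sign of its first argument
  }
  return wrapped - kPi;
}

// Signed shortest rotation that takes `from` to `to`.
double shortest_angular_distance(double from, double to)
{
  return normalize_angle(to - from);
}

// Linear interpolation along the shortest arc, alpha in [0, 1].
// The result is wrapped back into [-pi, pi), so near the seam it may come
// out as either -pi or a value just below pi; both are the same pose.
double interpolate_continuous(double start, double end, double alpha)
{
  assert(alpha >= 0.0 && alpha <= 1.0);
  return normalize_angle(start + alpha * shortest_angular_distance(start, end));
}
```

For the case in this PR, a naive `start + alpha * (end - start)` from -PI to PI passes through 0 at the midpoint, which is a full turn. With the shortest-arc version the distance from -PI to PI is (numerically) zero, so the joint stays where it is.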

Aug 10 '22 13:08 mechwiz
