Safety Check URCap with paused connection
First of all, I am not sure whether this is an issue or actual intended behaviour:
If I start the UR ROS2 control bringup, then run the URCap program on the actual robot I can control the robot via moveit/ros2_control without any problem.
But if I pause the URCap, manually move the robot to a different pose, and restart URCap, the robot moves to the last pose initiated by the controller really fast. Is this behavior intended? It's probably not the most common thing, to pause URCap and restart but it seems to be a big safety issue in my opinion if the connection is lost somehow.
This should definitely be considered a bug. For the ROS1 driver we took measures [1] and [2] to specifically avoid this, those seem not to have made it into the ROS2 driver.
I am not completely sure about the correct way to implement this in ROS2, but the idea is to reset all writing controllers if the control interface got lost. This (in ROS1) implicitly resets the command buffer to the current joint positions, so it will contain the new joint positions once going back active.
This can/should be done using hardware lifecycle so that you change state from active to inactive. This should be done by returning 'ERROR' in read or write and handle it then in on_error method. But to be fair there is currently missing mechanism in ros2_control to automatically unload controllers when this happens. We are currently working on internals to support this. Few steps/features are needed to do this nicely. I will keep you updated. The goal is to have this for humble.
I'll be in the lab tomorrow, where I can proof-of-concept that proposed (part) solution. At least the hazard motions should be stopped by that.
Edit: What I missed earlier: If we simply return ERROR from write, we're effectively back to #262 again.
Edit: What I missed earlier: If we simply return
ERRORfrom write, we're effectively back to #262 again.
Exactly, as I mentioned, we will need to add some more small features into ros2_control for this to work. The main part will come next week. So, you can test the concept. It basically misses the fourth (and probably the last) point from this PR comment.
Yesterday, the last missing piece to do recovery manually landed in the ros2_control's main branch. So the approach to handle this is the following:
- Hardware returns error on read or write.
- The error is handled in
on_errorcallback. - The hardware is in
UNCONFIGUREDstate. - Then the controllers should be deactivated manually. (JointStateBroadcaster probably not, but you can also do it).
- Then call service
set_hardware_statefrom controller manager to activate hardware again. - Then activate controllers again.
I am currently preparing a detailed example about this, but it will take a few weeks to get it finished.
Please also note the following potential issues:
- maybe you will have to export your command and state interfaces again. Try to use the same pointers as before. Nevertheless, this could confuse ControllerManager and the above process doesn't go smoothly.
- If you don't re-activate JointStateBroadcaster it could happen that interfaces are not connected correctly anymore and the values are not reported properly.
Hello again,
I have done some experiments with the galactic branch (debian build 2.1.2 using joint_trajectory_controller adaptions) and our UR5:
I have tested:
- running the driver using ur_bringup - ur_control_launch.py ur_type:=ur5
-
Pausing external control, move the robot with freedrive, restart external control --> All good, the robot stays where it is and the driver can be used (This was a problem when I opened this issue)
-
Test joint_trajectory_controller, pausing the external control connection in the middle of a trajectory, stop the publisher and restart external control: ---> Fast unsafe movement to goal pose
-
Same as 2. but using moveit: --> Same result as 2
I can not quite follow the complete controller logic and changes on the UR side but before I dig deeper, I thought I share my experiments. I thought it might not yet be supported with galactic but the odd thing was that the same experiment on our UR10 resulted in no problems (but this might also be a hardware controller difference). Am I missing something here? Is there anything I can do and or check to have this fixed?
I upgraded to Polyscope 3.15 and running the latest external control version 1.5, ros2 control is at 1.6.0
I will check this with a humble build as well asap
The behavior described in https://github.com/UniversalRobots/Universal_Robots_ROS2_Driver/issues/279#issuecomment-1225867070 is actually one of the reasons why we developed the scaled version. The non-scaled JTC is basically a feed forward controller that is not aware of the robot's state.
Pausing the program is similar to moving the speed slider to 0. The scaled JTC will not move forward while being paused, the non-scaled version does. Stopping the publisher won't abort a trajectory running inside the controller.
As soon as you move the robot using freedrive (or the buttons on the TP) the external control program gets stopped and then the controller_stopper actually stops the program.
To prevent this I suggest to use the scaled JTC instead of the non-scaled version. I could not imagine why the UR5 should behave different to a UR10, though.