motoros2 icon indicating copy to clipboard operation
motoros2 copied to clipboard

persistent "Final position was outside tolerance"

Open rsaintobert opened this issue 1 year ago • 16 comments

Anytime a trajectory is submitted I get the following error (with more or less joints incriminated), and the goal is aborted.

Final position was outside tolerance. Check robot safety-limits that could be inhibiting motion. [gp50_joint_1_s: 0.0000 deviation] [gp50_joint_6_t: 0.0000 deviation]

The log is done here but the 4-digits precision does not give enough info. Anyway the deviation is quite low here.

Click to expand:

https://github.com/Yaskawa-Global/motoros2/blob/9026a1d0b50b483f7c334bfd5ebbac8b328bfdb6/src/ActionServer_FJT.c#L563-L580

The problem is more in the check of the deviation:

Click to expand:

https://github.com/Yaskawa-Global/motoros2/blob/9026a1d0b50b483f7c334bfd5ebbac8b328bfdb6/src/ActionServer_FJT.c#L500-L520

Leaving the goal_tolerance parameter unset in the goal request should default to a tolerance of 0.01 which is not the case. Maybe the message buffer is badly initialized ?

Click to expand:

https://github.com/Yaskawa-Global/motoros2/blob/9026a1d0b50b483f7c334bfd5ebbac8b328bfdb6/src/ActionServer_FJT.c#L511-L512

Anyway, any value set in the goal request leads to the same result, so the user input is not taken properly into account.

rsaintobert avatar Apr 08 '24 16:04 rsaintobert

Could you please provide:

  • the motoros2_config.yaml as loaded on the controller
  • an example of a FollowJointTrajectoryAction goal which results in the problem you report (dumping the complete goal to stdout from your client program would be one way to do this)
  • a full debug log, from controller start up to and including a failed motion execution
  • information on which version of MotoROS2 this is, version of ROS 2, how you installed it, etc
  • information on your robot controller: type and robot configuration (a copy of the PANELBOX.LOG from your controller would be easiest)

(if easier: a minimal action client which submits a goal which results in the problem you report would also be ok -- and would make it easier to reproduce your problem)


Edit: and ideally: a Wireshark capture of the network traffic between your Agent and MotoROS2.

gavanderhoorn avatar Apr 08 '24 19:04 gavanderhoorn

I've edited the title of the issue to better reflect the symptoms observed.

We can't conclude yet what is the cause.

gavanderhoorn avatar Apr 09 '24 12:04 gavanderhoorn

Leaving the goal_tolerance parameter unset in the goal request should default to a tolerance of 0.01 which is not the case

I just noticed the == being used on a double here:

https://github.com/Yaskawa-Global/motoros2/blob/d563a79543956159b35e462cf7c49f8e3f9d2ec8/src/ActionServer_FJT.c#L505-L507

I'm wondering if posTolerance is some super-small value and technically non-zero. So, the comparison fails and the tolerance doesn't get set to default.

We should probably be using this value for the comparison instead:

https://github.com/Yaskawa-Global/motoros2/blob/d563a79543956159b35e462cf7c49f8e3f9d2ec8/src/MathConstants.h#L24-L25

ted-miller avatar Apr 09 '24 19:04 ted-miller

I've seen infinitesimal values for fields which were supposed to be 0.0 working on #226.

It's possible the (de)serialisation causes the field to no longer be exactly 0.0.

gavanderhoorn avatar Apr 09 '24 19:04 gavanderhoorn

Although thinking about it more: I'm curious to understand what makes this come up now in these cases, as the code implicated hasn't seen changes since MotoROS2 was first released.

gavanderhoorn avatar Apr 10 '24 06:04 gavanderhoorn

Could you please provide:

* the `motoros2_config.yaml` as loaded on the controller

* an example of a `FollowJointTrajectoryAction` goal which results in the problem you report (dumping the complete goal to `stdout` from your client program would be one way to do this)

* a [full debug log](https://github.com/Yaskawa-Global/motoros2/blob/b78ea310d6a5bb2b1c8f80ee0bf104c1986ca119/doc/troubleshooting.md#debug-log-client), from controller start up to and including a failed motion execution

* information on which version of MotoROS2 this is, version of ROS 2, how you installed it, etc

* information on your robot controller: type and robot configuration (a copy of the `PANELBOX.LOG` from your controller would be easiest)

(if easier: a minimal action client which submits a goal which results in the problem you report would also be ok -- and would make it easier to reproduce your problem)

Edit: and ideally: a Wireshark capture of the network traffic between your Agent and MotoROS2.

Here are the requested files: files.zip The version of ROS used is Humble.

Illumo-Vincent avatar Apr 11 '24 13:04 Illumo-Vincent

I have opened #236 to fix the comparison-check.

Here is a build to see if it has any impact on this issue. pr236.zip

ted-miller avatar Apr 11 '24 21:04 ted-miller

As @yai-rosejo noted in https://github.com/Yaskawa-Global/motoros2/discussions/235#discussioncomment-9072700, the path_tolerance and goal_tolerance fields aren't (always) populated.

The trajectory shared by @Illumo-Vincent (@rsaintobert?) has this:

 path_tolerance: []
 component_path_tolerance: []
 goal_tolerance: []
 component_goal_tolerance: []

without a check here on the length of the array:

https://github.com/Yaskawa-Global/motoros2/blob/d563a79543956159b35e462cf7c49f8e3f9d2ec8/src/ActionServer_FJT.c#L505-L507

we're most likely indexing into uninitialised memory.

This is UB in any case, but for some reason tends to fail for @rsaintobert / @Illumo-Vincent (and @TE-2 in #235) more than for other users, leading to the issue reported.

The change made by @ted-miller in #236 is a good one, but if we're indeed not sufficiently checking for goal_tolerance array length, not enough to deal with this problem I believe.


Edit: I've tried reproducing the issue with MoveIt, but don't seem to be able to. That doesn't mean it doesn't exist. I might just 'be lucky' and the memory being read happens to evaluate to a larger double than the per-joint delta.

gavanderhoorn avatar Apr 12 '24 13:04 gavanderhoorn

Looking into this more, it seems the current code doesn't parse the goal_tolerance part of the FJT goal correctly:

https://github.com/Yaskawa-Global/motoros2/blob/d563a79543956159b35e462cf7c49f8e3f9d2ec8/src/ActionServer_FJT.c#L505

It assumes all joints have a tolerance provided and the elements are ordered in the same way as the joints in the JointTrajectory.

In reality, we can't assume len(goal.trajectory.joint_names) == len(goal.goal_tolerance), and the ordering may also be different from that of goal.trajectory.joint_names. Each element in goal.goal_tolerance (of type control_msgs/JointTolerance) has a name field (here) which should be checked and used for finding the correct joint state to compare against.

@ted-miller: my suggestion would be to disable the final goal_tolerance check for now, provide @Illumo-Vincent / @rsaintobert and @TE-2 with a fixed build and in parallel work on correcting the implementation.

We have to keep in mind that in almost all cases, the Yaskawa controller will already have aborted the motion in case tracking error prevents it from faithfully executing the trajectory. The additional goal_tolerance check in MotoROS2 is wholly extra.

gavanderhoorn avatar Apr 12 '24 13:04 gavanderhoorn

I've submitted #237 for us to merge the work-around of disabling the broken check for now. It will always compare against the default joint position tolerance instead of attempting to use a user-provided value.

gavanderhoorn avatar Apr 12 '24 14:04 gavanderhoorn

@yai-rosejo has a fix which he's validating now. I don't think #237 is necessary.

ted-miller avatar Apr 12 '24 18:04 ted-miller

I have been working on a potential fix which can be found on my branch here: https://github.com/yai-rosejo/motoros2/tree/issue_233_outside_goal_tolerance On this branch I check the joint names and assign the goal position tolerances based off the names instead of just array indexing. I also allow support of the -1.0 option available for tolerances. My testing has shown that the values are being used correctly, but the Ros_Debug_BroadcastMsg is not able to print these values in a way that we would be expecting. image This picture shows that if I do a conditional print on a value of 2.0 or -1.0 that the value is correct. But that value is not able to be printed correctly, as shown by the multiple 0.000000s. I am including on here both the mr2_yrc1_h.out and the fjt_client.py that I have been using for testing. tolerance_issue_04122024.zip

To run these tests:

  1. Upload the mr2_yrc1_h.out to the controller and then restart the controller
  2. Run the microros client in a terminal : sudo docker run -it --net=host -v /dev:/dev --privileged microros/micro-ros-agent:humble udp4 -p 8888
  3. Start the TrajMode: ros2 service call /start_traj_mode motoros2_interfaces/srv/StartTrajMode
  4. Run the fjt client: python3 fjt_client.py
  5. Optionally the debug_listener can also be ran to capture messages.

yai-rosejo avatar Apr 12 '24 21:04 yai-rosejo

It seems we're fighting multiple issues.

  1. The tolerance arrays are not being properly validated for size and order. The fix from yai-rosejo should resolve this.

  2. The debug broadcasts are not printing doubles properly. We ran some tests that multiplied the values by 1000 and then cast as an int. The integer displays as expected. So we'll need to investigate vsnprintf behavior some more.

  3. I believe that Ros_Controller_IsInMotion() is not strict enough. Specifically START_MAX_PULSE_DEVIATION should be decreased. (Or make a new constant for "END deviation".) (Maybe make this configurable in yaml?)

A brief test indicated that the fb pos trails the cmd pos by about .0009 radians. If my calculations are right, for some axes (such as GP12 T axis) that's only 24 pulse counts. So, once the increment-queue is empty, Ros_Controller_IsInMotion would immediately return false, even though the robot is still moving.

I'm also suspicious that the difference between fb and cmd could shrink as the robot is decelerating. This might make it easier for other axes to also exceed the threshold.

  1. @yai-rosejo believes that feedback_FollowJointTrajectory.feedback.desired is not getting updated in the last cycle. This throws off the calculation for diff.

More testing is needed before we'll have an idea for PR.

ted-miller avatar Apr 16 '24 18:04 ted-miller

Could we open and merge PRs for each of the identified issues?

The fix for the incorrect parsing of the tolerances by @yai-rosejo seems like a good first step.

Regardless of whether any of the other issues get fixed, loading those tolerance entries would need to be corrected.

gavanderhoorn avatar Apr 16 '24 18:04 gavanderhoorn

@rsaintobert: with the merge of #241 this should now be fixed.

We'll have to release a new binary for this to be generally available though. Please let us know whether you'd like a prerelease build.

gavanderhoorn avatar Jun 18 '24 08:06 gavanderhoorn

Hello everyone,

I’m encountering the same issue where the robot moves successfully to the first waypoint (in Cartesian space) using the MoveIt Task Constructor (MTC), but then the following error occurs:

[move_group-4] [WARN] [1725660470.550437309] [moveit.simple_controller_manager.follow_joint_trajectory_controller_handle]: Controller 'follow_joint_trajectory' failed with error: unknown error: Final position was outside tolerance. Check robot safety limits that could be inhibiting motion. [group_1/joint_5: 0.0000 deviation] [move_group-4] [WARN] [1725660470.550522354] [moveit_ros.trajectory_execution_manager]: Controller handle follow_joint_trajectory reports status ABORTED [move_group-4] [INFO] [1725660470.550548538] [moveit_ros.trajectory_execution_manager]: Completed trajectory execution with status ABORTED ... [task_constructor_example-6] [ERROR] [1725660470.552712298] [moveit_task_constructor_executor_110906006630656]: Goal was aborted or canceled [task_constructor_example-6] [ERROR] [1725660470.562594942] [mtc_node]: Task execution failed

After reaching the first waypoint, I think the error is stopping the entire process, preventing the robot from continuing through the remaining waypoints to generate a complete trajectory.

Here are some relevant details:

  • OS: Ubuntu 22.04.4
  • ROS version: ROS2 Humble
  • Robot: Yaskawa GP4
  • Controller: YRC1000micro
  • MoveIt version: MoveIt 2
  • Framework: MoveIt Task Constructor (MTC)

What I’ve tried:

  • Lowered the tolerance in the ros2_controllers.yaml file, but the error persisted. I also tried including the settings in the moveit_controllers.yaml file (check the attached Zip folder).
  • Tried to disable the error check in the ActionServer_FJT.c file, but I couldn’t locate the file. I couldn't find any files or folders in the motoros2 repository; I don't even have the folder. After following the instructions on the motoros2 GitHub page, it wasn’t clear whether I needed to download or clone the main motoros2 repository, as it didn’t explicitly state that.

Here are my questions:

  • Is this issue specific to Yaskawa robots?
  • Has this been resolved?
  • Is my method for defining the tolerance correct?
  • How can I locate the ActionServer_FJT.c script to turn off the check?

Any help would be greatly appreciated.

Thank you, Serge

YAMLS.zip

SergeAlhalbi avatar Sep 11 '24 16:09 SergeAlhalbi