geometry2

Mitigate out of date transforms when rclcpp::spin() is slow to handle subscriptions

Open sloretz opened this issue 7 years ago • 3 comments

Currently tf2_ros::TransformListener has a hard-coded queue depth of 100 for the /tf topic subscription: https://github.com/ros2/geometry2/blob/ec26237b2c27a3fe6b7ef44083aae271a2c742a0/tf2_ros/src/transform_listener.cpp#L81-L84

It also (optionally) has a dedicated thread to call rclcpp::spin(). New subscriptions seem to get messages in order with this QoS and fastrtps. When this thread cannot process subscription callbacks fast enough, the transform listener's callback, TransformListener::subscription_callback_impl, is always called with the oldest transform in the queue, even though there are 99 newer messages available in the queue. In the case of one tf publisher publishing at 100 Hz, the transforms looked up will always be 1 second out of date (100 queued messages at 100 Hz), even though there is newer data in a lower layer. If tf2 is going to miss messages, then what it really wants is the latest message in the queue.
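The staleness described above can be reproduced with a toy model. The sketch below is not rclcpp or DDS code: it assumes a simple FIFO that drops its oldest entry when full and a consumer that takes one message per callback. The function name simulate_last_lag_ms and all of its parameters are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Toy model (an assumption, not the real middleware): a bounded FIFO that
// drops the oldest entry when full, a publisher stamping messages with the
// current time, and a slower consumer that always takes the oldest entry.
// Precondition: depth >= 1 and both periods > 0.
// Returns the age in milliseconds of the last message the consumer received.
std::int64_t simulate_last_lag_ms(std::size_t depth,
                                  std::int64_t publish_period_ms,
                                  std::int64_t consume_period_ms,
                                  std::int64_t duration_ms)
{
  std::deque<std::int64_t> queue;  // holds message timestamps in ms
  std::int64_t last_lag_ms = 0;

  for (std::int64_t now = 0; now <= duration_ms; ++now) {
    if (now % publish_period_ms == 0) {
      // Queue full: discard the oldest message to make room for the new one.
      if (!queue.empty() && queue.size() >= depth) {
        queue.pop_front();
      }
      queue.push_back(now);
    }
    if (now % consume_period_ms == 0 && !queue.empty()) {
      // The callback sees the oldest queued message, not the newest.
      last_lag_ms = now - queue.front();
      queue.pop_front();
    }
  }
  return last_lag_ms;
}
```

With a depth of 100, a 10 ms publish period (100 Hz), and a consumer that needs 12 ms per message, the delivered message ends up roughly one second old once the queue has filled. In the same model a depth of 1 keeps the delivered message fresh, at the cost of dropping messages.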

The best solution is for rclcpp::spin to always call callbacks fast enough, but maybe there is something TransformListener could do to mitigate this case.

Say there was a maintenance task on a timer that changed the queue size by creating a new subscription, and then shutting the old one down once the new one started receiving messages. The new QoS history size could be calculated from statistics about the transform sources. I think the minimum queue size needed to ensure no data is lost is the sum, over all transform broadcasters, of (the period of the broadcaster with the longest period between messages * the frequency of that broadcaster).
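A minimal sketch of that queue-size estimate, assuming the publish period of every transform broadcaster has already been measured in milliseconds. The function min_queue_depth and the millisecond representation are hypothetical and not part of tf2_ros. Rounding each ratio up is an added assumption so that non-integer ratios do not undercount.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: estimates the smallest history depth that holds every
// message published during the longest broadcaster period.
// Precondition: every period is > 0 (periods are in milliseconds).
std::size_t min_queue_depth(const std::vector<std::int64_t> & periods_ms)
{
  if (periods_ms.empty()) {
    return 0;
  }
  // Period of the slowest broadcaster (longest gap between its messages).
  const std::int64_t longest_ms =
    *std::max_element(periods_ms.begin(), periods_ms.end());

  std::size_t depth = 0;
  for (const std::int64_t period_ms : periods_ms) {
    // longest period * frequency == longest / period, rounded up.
    depth += static_cast<std::size_t>((longest_ms + period_ms - 1) / period_ms);
  }
  return depth;
}
```

For example, one broadcaster at 100 Hz (10 ms) plus one at 1 Hz (1000 ms) gives 100 + 1 = 101, slightly more than the current hard-coded depth of 100.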

This would mitigate ros2/ros1_bridge#133, which appears to be caused by subscription callbacks not being processed fast enough in RViz2.

sloretz, Sep 14 '18 22:09

I still have the same issue, even with the latest patched version.

You can replicate it running the dummy robot example:

  • Run the example: $ ros2 launch dummy_robot_bringup dummy_robot_bringup.launch.py
  • Open $ rviz2
  • Set the fixed frame to world
  • Add a TF plugin viewer. You will see the robot link moving
  • Add a LaserScan plugin and subscribe to /scan

At this point you will see the simulated laser scan on the oscillating link of the robot, and you will notice that it is rendered late with respect to its reference frame. Furthermore, you will notice a lot of errors: "Could not transform from [single_rrbot_hokuyo_link] to [world]" in the LaserScan plugin, and "[ERROR] [rviz2]: Lookup would require extrapolation into the future. Requested time 1538495387.01342 but the latest data is at time 1538495386.98954, when looking up transform from frame [single_rrbot_hokuyo_link] to frame [world]" in the console where RViz2 was launched.

image

Myzhar, Oct 02 '18 15:10

@Myzhar I'm not sure the issue you described is related to this ticket. Would you mind posting it as a new issue on https://github.com/ros2/rviz instead?

sloretz, Oct 08 '18 16:10

@sloretz You are right.

I reposted the comment here: https://github.com/ros2/rviz/pull/354

Myzhar, Oct 08 '18 21:10