Introduce managed transform buffer to geometry2
Recently I was trying to address an issue of high CPU load at runtime when many TF listeners are present in the ROS graph. As a result I developed a wrapper for the TF listener & buffer, please see https://github.com/autowarefoundation/ManagedTransformBuffer.
We already have a static TF listener in geometry2, but the Managed Transform Buffer acts as a dynamic or static listener automatically. This is a convenient approach for users when the input frames are defined as parameters. Moreover, when only static TFs are requested, there is no dangling tf_listener_impl_xxxxxxxx node - we use a local buffer, and TF listening happens only for the first request of each TF pair.
There is one intuitive assumption: once a static TF has been requested, the transform is constant. This TF will already exist in the local buffer and the listener is not initialized anymore (note: we can add an external local-buffer reset via a service for users if needed).
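As a rough illustration of this caching assumption (this is not the actual ManagedTransformBuffer API; the class and method names below are made up for the sketch), the local buffer can be thought of as a map keyed by frame pairs:

```cpp
// Conceptual sketch only: a local cache of static transforms keyed by
// (target, source) frame pair. Once a static transform has been obtained,
// later requests are served from the cache without any TF listening.
#include <map>
#include <optional>
#include <string>
#include <utility>

#include "geometry_msgs/msg/transform_stamped.hpp"

class StaticTransformCacheSketch
{
public:
  // Returns the cached transform, or std::nullopt on the first request
  // (which is the only time a listener lookup would be needed).
  std::optional<geometry_msgs::msg::TransformStamped> get(
    const std::string & target, const std::string & source) const
  {
    const auto it = cache_.find({target, source});
    if (it == cache_.end()) {
      return std::nullopt;
    }
    return it->second;  // Static transform is constant: serve from cache.
  }

  void put(
    const std::string & target, const std::string & source,
    const geometry_msgs::msg::TransformStamped & tf)
  {
    cache_[{target, source}] = tf;
  }

private:
  std::map<std::pair<std::string, std::string>,
    geometry_msgs::msg::TransformStamped> cache_;
};
```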
The current implementation involves two nodes. If we could place it in geometry2 with a tiny API update, we might reduce it to a single node. To make that possible, we need to:
- Fill the local map with incoming transforms somewhere here: https://github.com/ros2/geometry2/blob/24a8b9a1433220775c186cc7ef556972c6ca5402/tf2/src/buffer_core.cpp#L274
- Update the history of the user's TF requests (whether only static TFs have been requested so far) somewhere here: https://github.com/ros2/geometry2/blob/24a8b9a1433220775c186cc7ef556972c6ca5402/tf2/src/buffer_core.cpp#L851. This includes a traversal over the TF tree to check whether all transforms between `target_frame` and `source_frame` are `is_static`.
- Add a new public method to retrieve this information, e.g. `hasStaticTFsRequestsOnly()` (see the sketch after this list).
- Rewrite a new class for the Managed Transform Buffer without the extra features (e.g. Eigen representations, transforming point clouds, etc.) and place it in tf2_ros.
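A minimal sketch of the proposed bookkeeping (the names `static_requests_only_` and `hasStaticTFsRequestsOnly()` are illustrative, not existing `BufferCore` members): the lookup path would record whether every edge in the resolved chain was static, and the new accessor would expose that state:

```cpp
// Sketch only, not the geometry2 implementation: track whether every TF
// request seen so far resolved through static transforms exclusively.
#include <atomic>

class StaticRequestTrackerSketch
{
public:
  // Would be called from the lookup path after traversing the tree between
  // target_frame and source_frame and checking is_static on each edge.
  void recordRequest(bool all_edges_static)
  {
    if (!all_edges_static) {
      static_requests_only_ = false;
    }
  }

  // Proposed public accessor: true while only static TFs have been requested.
  bool hasStaticTFsRequestsOnly() const
  {
    return static_requests_only_;
  }

private:
  std::atomic<bool> static_requests_only_{true};
};
```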
The overhead is negligible, the benefits visible. You can also see the performance test for one of our related component issues. My motivation for this contribution is the fact that the ROS community values maintaining existing packages more than creating new ones. Before starting this contribution, I would like to know if these changes within tf2 are acceptable for Rolling and possibly backporting to Humble & Jazzy.
Hi @amadeuszsz,
Thanks for your valuable suggestions to improve TF2.
With a more complete proposal we can introduce the change into Rolling. I've added this ticket to the agenda for the next ROS 2 PMC meeting to get a consensus from other core developers.
Hi, after reviewing the PR and related tickets I don't think that this is a good change.
- As you pointed out, there's already a way to subscribe only to the `tf_static` topic. You mention that this still requires the creation of a "hidden" node, but that's not true: the tf2 listener constructors can take node interfaces as input, so the correct way to construct a listener is to pass the interfaces from the parent node (see the sketch after this list).
- The behavior of this new listener could be hard to understand: initially it behaves as a static listener, but it's sufficient to request a dynamic TF once for it to start behaving as a dynamic one. This makes its resource utilization harder to reason about than that of two separate TF listeners. Users who are not aware of this subtle implementation detail may end up "activating" the dynamic listener without knowing it.
- minor: the name "managed" should be reserved for ROS 2 lifecycle entities
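For reference, a minimal sketch of that pattern (member names are illustrative): constructing the listener from the parent node so it reuses the node's interfaces instead of spawning a separate hidden listener node.

```cpp
// Sketch of the standard pattern: pass the parent node to the listener so
// the /tf and /tf_static subscriptions are created on this node rather than
// on a separately spawned hidden node.
#include <memory>

#include "rclcpp/rclcpp.hpp"
#include "tf2_ros/buffer.h"
#include "tf2_ros/transform_listener.h"

class MyNode : public rclcpp::Node
{
public:
  MyNode()
  : Node("my_node")
  {
    tf_buffer_ = std::make_unique<tf2_ros::Buffer>(get_clock());
    // Passing `this` (or the node interfaces) attaches the subscriptions to
    // this node; omitting it would let the listener create its own node.
    tf_listener_ = std::make_unique<tf2_ros::TransformListener>(*tf_buffer_, this);
  }

private:
  std::unique_ptr<tf2_ros::Buffer> tf_buffer_;
  std::unique_ptr<tf2_ros::TransformListener> tf_listener_;
};
```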
I think that users who are cautious about performance should also be explicit in what they are doing and create separate listeners if they only need static or dynamic TFs.
I acknowledge that there's a problem in resource utilization due to the tf topics. This can be mitigated with a better executor implementation and by taking advantage of node composition.
In particular, with node composition you could also have a global tf listener shared by all nodes in the same process, thus reducing CPU usage a lot (assuming that the number of nodes >> the number of processes). A sketch of this pattern follows.
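A rough sketch of that idea using manual composition (all class and node names are illustrative, not an existing API): one node in the process owns the single tf listener and its buffer, and the other nodes in the same executor just hold a pointer to the shared buffer.

```cpp
// Sketch: one process, one shared tf buffer and listener, several nodes
// querying it through a shared pointer.
#include <memory>
#include <string>
#include <utility>

#include "rclcpp/rclcpp.hpp"
#include "tf2_ros/buffer.h"
#include "tf2_ros/transform_listener.h"

class Worker : public rclcpp::Node
{
public:
  Worker(const std::string & name, std::shared_ptr<tf2_ros::Buffer> buffer)
  : Node(name), buffer_(std::move(buffer)) {}
  // ... timers / subscriptions that call buffer_->lookupTransform(...) ...

private:
  std::shared_ptr<tf2_ros::Buffer> buffer_;
};

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);

  // A single helper node hosts the only /tf + /tf_static subscription pair.
  auto tf_node = rclcpp::Node::make_shared("shared_tf_listener");
  auto buffer = std::make_shared<tf2_ros::Buffer>(tf_node->get_clock());
  tf2_ros::TransformListener listener(*buffer, tf_node, false);

  auto worker_a = std::make_shared<Worker>("worker_a", buffer);
  auto worker_b = std::make_shared<Worker>("worker_b", buffer);

  rclcpp::executors::SingleThreadedExecutor executor;
  executor.add_node(tf_node);
  executor.add_node(worker_a);
  executor.add_node(worker_b);
  executor.spin();

  rclcpp::shutdown();
  return 0;
}
```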
Looking through this, it's a neat way to partially reduce the bandwidth consumption. However, I think it's a bit too fragile and it adds complexity, which makes things harder to document and explain. I'd like to focus on reducing CPU usage overall. The main cause of high CPU usage is that TF was designed to provide minimum-latency queries on any transform in the system, so by default every node subscribes to the full stream and maintains its own buffer. However, there are a lot of use cases where you don't need super low latency, and to address those we have a standard `tf2_ros::BufferServer`: https://github.com/ros2/geometry2/blob/rolling/tf2_ros/include/tf2_ros/buffer_server.h which can be embedded in any node (or run as a standalone https://github.com/ros2/geometry2/blob/rolling/tf2_ros/src/buffer_server_main.cpp ) and provides an interface that can be queried from other nodes.
We've provided both a C++ client https://github.com/ros2/geometry2/blob/rolling/tf2_ros/include/tf2_ros/buffer_client.h and a Python client https://github.com/ros2/geometry2/blob/rolling/tf2_ros_py/tf2_ros/buffer_client.py which can point at any TF BufferServer instance and query it remotely. This is designed for systems that only need to query tf periodically and would rather not incur the cost of maintaining their own buffer. The most common use case is a background script or process that monitors conditions at low frequency. When you have many of these, especially in Python, this will save a lot of CPU cycles.
If you know that you're only querying static values from the buffer, then you can easily query them remotely and cache them.
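As an illustration, a minimal remote-query client might look like the following (a sketch under assumptions: the server is already running and was started with the action namespace "tf2_buffer_server"; adjust the namespace to match your BufferServer instance).

```cpp
// Sketch: query a remote tf2_ros::BufferServer via tf2_ros::BufferClient
// instead of subscribing to /tf and maintaining a local buffer.
#include <memory>
#include <thread>

#include "rclcpp/rclcpp.hpp"
#include "tf2/time.h"
#include "tf2_ros/buffer_client.h"

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = rclcpp::Node::make_shared("slow_tf_consumer");

  // Spin in the background so the action client's responses are processed.
  std::thread spin_thread([node]() {rclcpp::spin(node);});

  // "tf2_buffer_server" is assumed to be the action namespace the server
  // was started with.
  tf2_ros::BufferClient buffer_client(node, "tf2_buffer_server");

  try {
    // Each lookup is answered by the server's buffer; no /tf subscription
    // exists in this process.
    const auto transform = buffer_client.lookupTransform(
      "map", "base_link", tf2::TimePointZero, tf2::durationFromSec(1.0));
    RCLCPP_INFO(
      node->get_logger(), "map -> base_link: x = %f",
      transform.transform.translation.x);
  } catch (const std::exception & ex) {
    RCLCPP_WARN(node->get_logger(), "Remote lookup failed: %s", ex.what());
  }

  rclcpp::shutdown();
  spin_thread.join();
  return 0;
}
```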
Can you give us some detail about the cause of the high CPU load? Our finding was that every tf subscriber gets hammered with callbacks, in extreme cases in the kHz range; in our case this was the main cause of high CPU load. We found that one contributing factor was poor usage of the TransformBroadcaster. There is an API to send each transformation individually, and one to send them in bulk (https://github.com/ros2/geometry2/blob/00e182217b3d0f668adb639923b6cc4e2ddfda59/tf2_ros/include/tf2_ros/transform_broadcaster.h#L116). Using the bulk send reduced the callback CPU load massively for us. If latency is not a big issue, a second option is to have aggregator nodes that collect transformations from several devices and broadcast them using the bulk API.
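For illustration, the bulk variant looks roughly like this (the node and timer setup are just scaffolding; the relevant part is passing a `std::vector` of transforms to a single `sendTransform()` call).

```cpp
// Sketch: publish all transforms produced in one cycle as a single /tf
// message, so each listener gets one callback per cycle instead of one
// callback per transform.
#include <chrono>
#include <memory>
#include <vector>

#include "geometry_msgs/msg/transform_stamped.hpp"
#include "rclcpp/rclcpp.hpp"
#include "tf2_ros/transform_broadcaster.h"

class BulkBroadcaster : public rclcpp::Node
{
public:
  BulkBroadcaster()
  : Node("bulk_broadcaster")
  {
    broadcaster_ = std::make_unique<tf2_ros::TransformBroadcaster>(*this);
    timer_ = create_wall_timer(
      std::chrono::milliseconds(10), [this]() {publish();});
  }

private:
  void publish()
  {
    std::vector<geometry_msgs::msg::TransformStamped> transforms;
    // ... fill `transforms` with all frames computed this cycle ...

    // Bulk overload: one message regardless of how many transforms it holds.
    broadcaster_->sendTransform(transforms);
  }

  std::unique_ptr<tf2_ros::TransformBroadcaster> broadcaster_;
  rclcpp::TimerBase::SharedPtr timer_;
};
```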
This issue has been mentioned on ROS Discourse. There might be relevant details there:
https://discourse.ros.org/t/ros-pmc-meeting-minutes-2025-01-18/42127/1
For ROS 1, we've developed a slightly different solution: https://github.com/peci1/tf2_server/tree/master .
This is a central node that subscribes to all TFs (as usual) and allows defining subtrees. Each subtree creates a pair of topics in a namespace where only the relevant TFs are published. Clients can then subscribe to these namespaced topics instead of /tf + /tf_static to get only the subset of TFs they need for their operation.
I've made the server a bit more complicated by providing a runtime service that can dynamically create the subtrees (in addition to static definition of subtrees on startup).
From my point of view - if you're only interested in static TFs, then remap /tf to /nonexistent.
Thank you for all valuable comments!
For the record of this discussion, I'm linking a draft with the feature ported to Rolling: https://github.com/ros2/geometry2/pull/760.
The idea behind the Managed TF Buffer is to switch between static and dynamic listening on purpose. I don't consider this a disadvantage, but rather an implicit approach, as opposed to the currently available explicit ways of handling TFs. On the other hand, I fully understand that this may not follow the TF Buffer design, could break its architecture, or might simply not align with how the maintainers see the future of this project.