navigation2
navigation2 copied to clipboard
Running TB3 simulation with some namespace won't work with composition nodes
When I running Multi-TB3 simulation with use_composition:=True
(default), Nav2 stack does not start:
[leha@leha-PC nav2_ws]$ ros2 node list
/gazebo
/launch_ros_4468
/robot1/intel_realsense_r200_depth_driver
/robot1/nav2_container
/robot1/robot_state_publisher
/robot1/rviz
/robot1/transform_listener_impl_7fe25c0109c0
/robot1/turtlebot3_diff_drive
/robot1/turtlebot3_imu
/robot1/turtlebot3_joint_state
/robot1/turtlebot3_laserscan
/robot2/intel_realsense_r200_depth_driver
/robot2/nav2_container
/robot2/robot_state_publisher
/robot2/rviz
/robot2/transform_listener_impl_7fc7e4001800
/robot2/turtlebot3_diff_drive
/robot2/turtlebot3_imu
/robot2/turtlebot3_joint_state
/robot2/turtlebot3_laserscan
When disabling it, everything works fine.
Bug report
Required Info:
- Operating System:
- Ubuntu 20.04
- ROS2 Version:
- ROS2 Rolling built from sources
- Version or commit hash:
- 42d0bbd026f84fb6447982f4448ea5c51f6fdebb
- DDS implementation:
- CycloneDDS
Steps to reproduce issue
ros2 launch nav2_bringup multi_tb3_simulation_launch.py autostart:=True
Expected behavior
Nav2 stack starts
Actual behavior
Nav2 main nodes do not start
Additional information
It turns out, that running tb3_simulation_launch.py
with some non-empty root namespace works only with disabled composition nodes. If enable use_composition
parameter for that case, no nodes from localization_launch.py
and navigation_launch.py
(map_server, amcl, controller, smoother, etc...) are starting. This appears without any error or warning messages in the console: fail.log
It would have to be run in different containers that the containers have the namespaces in -- or set the node namespaces in the composition load commands, I believe (https://docs.ros.org/en/foxy/Tutorials/Intermediate/Composition.html). Or, we can remove the option for composition for multiple robot bringup and hardcode it in the launch file to be non-composed.
I've run into this issue and wanted to try Compositions for my robot. I've made a temporary fix for the problem (not sure if it's proper).
Added the following code to the normal navigation_launch.py file
...
use_respawn = LaunchConfiguration('use_respawn')
package = LaunchConfiguration('package')
executable= LaunchConfiguration('executable')
...
Then I declare them and create a node to hold the container.
...
declare_package_cmd = DeclareLaunchArgument(
'package', default_value='rclcpp_components',
description='the package that contains the component container executable')
declare_executable_cmd = DeclareLaunchArgument(
'executable', default_value='component_container',
description='the name of the component container executable')
container = GroupAction(
condition=IfCondition(use_composition),
actions=[
Node(
package=package,
executable=executable,
name=container_name,
output='both',
parameters=[configured_params],)
],
)
...
Finally, I add the declarations and new container node.
...
ld.add_action(declare_container_name_cmd)
ld.add_action(declare_use_respawn_cmd)
ld.add_action(declare_package_cmd)
ld.add_action(declare_executable_cmd)
# Add the actions to launch all of the navigation nodes
ld.add_action(load_nodes)
ld.add_action(container)
ld.add_action(load_composable_nodes)
return ld
There's already a component container in bringup_launch.py: https://github.com/ros-planning/navigation2/blob/main/nav2_bringup/launch/bringup_launch.py#L120-L128 so that we can add in both Nav2 and localization components to a main container.
What about these changes actually enables the namespacing? Is it just the declaration of the parameters? Why would moving the container here make a difference?
You also swapped the type of component container to a single threaded component container and not the isolated single threaded component container. That is going to have significant impacts on your performance. Is that the difference?
I'm not sure what's proper/improper since I'm not sure which of these changes is what enabled it :disappointed_relieved:
Ah I haven't used the nav2_bringup script. I have my localization separate and don't do slam.
Just launching navigation_launch.py doesn't work with compositions, which is the reason for the mod I made. I understand the intended workflow now though :) Everyone can disgard my comment. This also is not for TB3, but just my own situation.
Well nothing here should be special about TB3 for these launch files, beyond the model that is run in gazebo (and I guess the parameter file values, to some degree). It should generalize to any robot easily.
Got it - you just didn't have a component container anywhere yet!
The problem appears due to incorrect composition container (starting in the bringup_launch.py
) name handling. Composable nodes do not not address this name relatively to their namespace, so we need to forcibly set it manually. Something like:
diff --git a/nav2_bringup/launch/bringup_launch.py b/nav2_bringup/launch/bringup_launch.py
index 8fddcc32..5e987750 100644
--- a/nav2_bringup/launch/bringup_launch.py
+++ b/nav2_bringup/launch/bringup_launch.py
@@ -144,7 +144,7 @@ def generate_launch_description():
'params_file': params_file,
'use_composition': use_composition,
'use_respawn': use_respawn,
- 'container_name': 'nav2_container'}.items()),
+ 'container_name': (namespace, '/', 'nav2_container')}.items()),
IncludeLaunchDescription(
PythonLaunchDescriptionSource(os.path.join(launch_dir, 'navigation_launch.py')),
@@ -154,7 +154,7 @@ def generate_launch_description():
'params_file': params_file,
'use_composition': use_composition,
'use_respawn': use_respawn,
- 'container_name': 'nav2_container'}.items()),
+ 'container_name': (namespace, '/', 'nav2_container')}.items()),
])
# Create the launch description and populate
It works for me. But further, another problem appears: composable nodes are not reading ROS-parameters from input YAML-file. On the controller_server
, we the following message is being observed:
[component_container_isolated-6] [INFO] [1672229364.645884363] [robot1.controller_server]: Created controller : FollowPath of type dwb_core::DWBLocalPlanner
[component_container_isolated-6] [INFO] [1672229364.657283256] [robot1.controller_server]: Setting transform_tolerance to 0.100000
...
[component_container_isolated-6] [ERROR] [1672229364.715675495] [robot1.controller_server]: Couldn't load critics! Caught exception: No critics defined for FollowPath
[component_container_isolated-6] [ERROR] [1672229364.723367376] []: Caught exception in callback for transition 10
[component_container_isolated-6] [ERROR] [1672229364.723396968] []: Original error: Couldn't load critics! Caught exception: No critics defined for FollowPath
[component_container_isolated-6] [WARN] [1672229364.723431018] []: Error occurred while doing error handling.
[component_container_isolated-6] [FATAL] [1672229364.723458480] [robot1.controller_server]: Lifecycle node controller_server does not have error state implemented
[component_container_isolated-6] [ERROR] [1672229364.724089552] [robot1.lifecycle_manager_navigation]: Failed to change state for node: controller_server
[component_container_isolated-6] [ERROR] [1672229364.724126749] [robot1.lifecycle_manager_navigation]: Failed to bring up all requested nodes. Aborting bringup.
<- this is because ROS-parameters are not being set from the YAML-file for some reasons (transform_tolerance should be 0.200000, 0.1 is default value).
This is a separate problem that should be fixed in a pair with this issue. New bug #3376 was opened on this effect.
Composable nodes do not not address this name relatively to their namespace,
That seems like a reasonable solution!
Or, what if we set the namespace here instead? Either way, but I think that option would also work.
Its probably worth filing a bug that including launch files doesn't push the namespace even though its under it https://github.com/ros-planning/navigation2/blob/main/nav2_bringup/launch/bringup_launch.py#L113-L115 for the composible node functions (or is this even having to do with including other launch files? If we have the push namespace and load composable node in the same PushNamespace, does it still not work?). It worth reporting this issue.
W.r.t. parameter issues: the namespace has to exist in the yaml file instead if you're not using the remapping yaml functions specific to nav2 for dealing with that automatically.
The option to set the namespace in the navigation_launch.py will also work. We also need in this case also do this for localization_launch.py
as well. However, the meaning of container_name
as input parameter for these scripts will be preserved, which is suitable for further scripts maintenance. Thank you for the suggestion, I'll try to arrange this properly.
It worth reporting this issue.
I think so: at least it seems strange that container does not take the root namespace where it should be launched.
W.r.t. parameter issues: the namespace has to exist in the yaml file instead if you're not using the remapping yaml functions specific to nav2 for dealing with that automatically.
For this case, both testing nodes are intended to be run in the same namespace, so we are already setting the root namespace through the RewrittenYaml()
routine. This is gonna to be really strange, and it seems to be a real issue (maybe a launch_ros
case). I will report more precisely in this ticket after I do more research on this.
Moving the discussion of problem No.2 from separate ticket to here (to not lose the thread):
When running two or more component nodes inside one container with some non-empty root namespace, composable nodes inside this container can not read ROS-parameters from YAML correctly.
Consider the attached example package: nav2_pkg.zip
This example contains two lifecycle nodes, exported as composable nodes, that are reading node_param
parameter from input YAML-file (by-default parameter's value is UNDEFINED
):
node1:
ros__parameters:
node_param: "PASS"
node2:
ros__parameters:
node_param: "PASS"
Unpack it in navigation2
source directory and build it:
colcon build --symlink-install --packages-select nav2_pkg
Then run the example with some root namespace (e.g. robot1
):
ros2 launch nav2_pkg test_launch.py namespace:=robot1 use_composition:=True
Here the launch-file - is a simplified localization_launch.py
from nav2_bringup.
Expected behavior
Both parameters to be set to PASS
value, that could be observed from logs.
Actual behavior
In case of composable node on robot1
root namespace and we could observe some kind of undefined behavior: for the first initialized node this parameter to be read from input YAML, while for other ones it can not and to be left to default:
ros2 launch nav2_pkg test_launch.py namespace:=robot1 use_composition:=True
...
[component_container_isolated-1] [INFO] [1674138269.616420930] [robot1.lifecycle_manager_localization]: Configuring node1
[component_container_isolated-1] [INFO] [1674138269.616951726] [robot1.node1]: Configuring
[component_container_isolated-1] [INFO] [1674138269.617034718] [robot1.node1]: NODE1 name = node1, namespace = /robot1
[component_container_isolated-1] [INFO] [1674138269.617052866] [robot1.node1]: node_param = PASS
[component_container_isolated-1] [INFO] [1674138269.617675336] [robot1.lifecycle_manager_localization]: Configuring node2
[component_container_isolated-1] [INFO] [1674138269.617915953] [robot1.node2]: Configuring
[component_container_isolated-1] [INFO] [1674138269.617993659] [robot1.node2]: NODE2 name = node2, namespace = /robot1
[component_container_isolated-1] [INFO] [1674138269.618012148] [robot1.node2]: node_param = UNDEFINED
Running the same example, but with conventional separate nodes will give correct result:
ros2 launch nav2_pkg test_launch.py namespace:=robot1 use_composition:=False
...
[lifecycle_manager-3] [INFO] [1674138305.137439608] [robot1.lifecycle_manager_localization]: Configuring node1
[node1-1] [INFO] [1674138305.138322511] [robot1.node1]: Configuring
[node1-1] [INFO] [1674138305.138406182] [robot1.node1]: NODE1 name = node1, namespace = /robot1
[node1-1] [INFO] [1674138305.138425920] [robot1.node1]: node_param = PASS
[lifecycle_manager-3] [INFO] [1674138305.139189646] [robot1.lifecycle_manager_localization]: Configuring node2
[node2-2] [INFO] [1674138305.139818653] [robot1.node2]: Configuring
[node2-2] [INFO] [1674138305.139908128] [robot1.node2]: NODE2 name = node2, namespace = /robot1
[node2-2] [INFO] [1674138305.139929253] [robot1.node2]: node_param = PASS
Moreover, if remove root-namespace (leave it empty), both composition nodes will work correctly:
ros2 launch nav2_pkg test_launch.py use_composition:=True
...
[component_container_isolated-1] [INFO] [1674138331.440440362] [lifecycle_manager_localization]: Configuring node1
[component_container_isolated-1] [INFO] [1674138331.440878937] [node1]: Configuring
[component_container_isolated-1] [INFO] [1674138331.440951747] [node1]: NODE1 name = node1, namespace = /
[component_container_isolated-1] [INFO] [1674138331.440970335] [node1]: node_param = PASS
[component_container_isolated-1] [INFO] [1674138331.441564540] [lifecycle_manager_localization]: Configuring node2
[component_container_isolated-1] [INFO] [1674138331.441749216] [node2]: Configuring
[component_container_isolated-1] [INFO] [1674138331.441810231] [node2]: NODE2 name = node2, namespace = /
[component_container_isolated-1] [INFO] [1674138331.441827782] [node2]: node_param = PASS
Additional information
The fact that undefined behavior appears only for composable nodes with non-empty root namespace leads to thought that the problem might be in composable node running mechanism, or RewrittenYAML parser. Needs an additional investigation.
For the second issue, opened a new bug on ROS2 launch: https://github.com/ros2/launch_ros/issues/344
I'd make sure its not the rewritten yaml parser by making a yaml file that has the namespaces setup in the yaml file itself and then rerunning your experiment using that file without the Rewritten yaml function. That way it doesn't rely on any non-standard functions and should just be reading the raw yaml file as setup. Does that still have the issue? I see that your zip file in the launch ticket has the yaml file with the namespace already in it without rewritten yaml, so I suspect you already did that and still having the issue.
What's special about node1 vs node2? If you swap the order in the launch file, does the behavior reverse? I'm trying to suss out why one would work at all if the other does not, since the names and namespaces are totally arbitrary strings.
Edit: I just noticed you nested multiple nodes under the same namespace in the yaml file. I don't actually know if you can do that. I think you need to have the namespace in each - which is what the rewritten yaml should be doing. Does that fix the issue:
robot1:
node1:
ros__parameters:
node_param: "PASS"
robot1:
node2:
ros__parameters:
node_param: "PASS"
I see that your zip file in the launch ticket has the yaml file with the namespace already in it without rewritten yaml, so I suspect you already did that and still having the issue.
Yep, that is that was done for the ROS2 launch ticket: refuse any Nav2 API: RewrittenYaml
and nav2_utils::LifecycleNodes
, to rely on ROS2 API only for this report. That is why, I suggest the problem is not related to Nav2 at all.
What's special about node1 vs node2? If you swap the order in the launch file, does the behavior reverse? I'm trying to suss out why one would work at all if the other does not, since the names and namespaces are totally arbitrary strings.
Node1 and Node2 are totally identical. The difference is only in their names, class names, library names, etc... If swap node1 and node2 running sequence, nothing changes: "PASS" case will be the same independently from running order. However, if swap node1 and node2 parameters in the YAML-file, the "PASS" will go to the first met in YAML.
I just noticed you nested multiple nodes under the same namespace in the yaml file. I don't actually know if you can do that. I think you need to have the namespace in each - which is what the rewritten yaml should be doing. Does that fix the issue
Duplicating of namespaces in YAML should not work, as I recall from the https://github.com/ros-planning/navigation2/issues/1217 ticket discussions.
If check the dump output of RewrittenYaml::perform()
, it will produce only one root namespace when doing root key substitution (taken from the example for this ticket):
Dumping /home/amerzlyakov/nav2_ws/install/nav2_pkg/share/nav2_pkg/params/params.yaml :
robot1:
node1:
ros__parameters:
node_param: PASS
node2:
ros__parameters:
node_param: PASS
Therefore, for the ROS2 launch ticket, YAML-config was chosen the same structure. If intentionally set two namespaces as from suggestion above, the situation doesn't change much: PASS and UNDEFINED values are being swapped. It seems that YAML-parser will use only one of key, in my experiment it was the second one for node2. But, unfortunately, this is not resolving the situation.
Moreover, since for the non-composable node case everything works fine for all YAML variations, I believe that the problem is hiding somewhere in the composable node running mechanism.
Problem No.3 detected on multi-robot launch:
BT node loading failed due to unrecognized WouldAPlannerRecoveryHelp
node:
$ ros2 launch nav2_bringup tb3_simulation_launch.py headless:=False use_namespace:=True namespace:=robot1 params_file:=${HOME}/nav2_ws/src/navigation2/nav2_bringup/params/nav2_multirobot_params_1.yaml rviz_config_file:=${HOME}/nav2_ws/src/navigation2/nav2_bringup/rviz/nav2_namespaced_view.rviz use_composition:=False
...
[planner_server-11] [INFO] [1673955531.322241057] [robot1.planner_server]: Creating bond (planner_server) to lifecycle manager.
[lifecycle_manager-16] [INFO] [1673955531.428133478] [robot1.lifecycle_manager_navigation]: Server planner_server connected with bond.
[lifecycle_manager-16] [INFO] [1673955531.428265895] [robot1.lifecycle_manager_navigation]: Activating behavior_server
[behavior_server-12] [INFO] [1673955531.429524420] [robot1.behavior_server]: Activating
[behavior_server-12] [INFO] [1673955531.429621378] [robot1.behavior_server]: Activating spin
[behavior_server-12] [INFO] [1673955531.429648553] [robot1.behavior_server]: Activating backup
[behavior_server-12] [INFO] [1673955531.429670626] [robot1.behavior_server]: Activating drive_on_heading
[behavior_server-12] [INFO] [1673955531.429689944] [robot1.behavior_server]: Activating wait
[behavior_server-12] [INFO] [1673955531.429715559] [robot1.behavior_server]: Creating bond (behavior_server) to lifecycle manager.
[lifecycle_manager-16] [INFO] [1673955531.534140604] [robot1.lifecycle_manager_navigation]: Server behavior_server connected with bond.
[lifecycle_manager-16] [INFO] [1673955531.534221559] [robot1.lifecycle_manager_navigation]: Activating bt_navigator
[bt_navigator-13] [INFO] [1673955531.534882797] [robot1.bt_navigator]: Activating
[bt_navigator-13] [ERROR] [1673955531.535402206] [robot1.bt_navigator]: Exception when loading BT: Error at line 14: -> Node not recognized: WouldAPlannerRecoveryHelp
[bt_navigator-13] [ERROR] [1673955531.535481053] [robot1.bt_navigator]: Error loading XML file: /home/amerzlyakov/nav2_ws/install/nav2_bt_navigator/share/nav2_bt_navigator/behavior_trees/navigate_to_pose_w_replanning_and_recovery.xml
[lifecycle_manager-16] [ERROR] [1673955531.536951736] [robot1.lifecycle_manager_navigation]: Failed to change state for node: bt_navigator
[lifecycle_manager-16] [ERROR] [1673955531.537064459] [robot1.lifecycle_manager_navigation]: Failed to bring up all requested nodes. Aborting bringup.
[rviz2-5] Start navigation
[rviz2-5] [INFO] [1673955534.212595970] [robot1._]: NavigateToPose will be called using the BT Navigator's default behavior tree.
[rviz2-5] [ERROR] [1673955534.216347017] [robot1._]: Goal was rejected by server
The problem appears because of nav2_would_a_planner_recovery_help_condition_bt_node
libarary is not being loaded, which is caused by outdated nav2_multirobot_params_1/2.yaml
config files where the actual BT configuration should be added:
diff --git a/nav2_bringup/params/nav2_multirobot_params_1.yaml b/nav2_bringup/params/nav2_multirobot_params_1.yaml
index 21bc2e67..2f4cd036 100644
--- a/nav2_bringup/params/nav2_multirobot_params_1.yaml
+++ b/nav2_bringup/params/nav2_multirobot_params_1.yaml
@@ -70,6 +70,10 @@ bt_navigator:
- nav2_goal_updated_condition_bt_node
- nav2_globally_updated_goal_condition_bt_node
- nav2_is_path_valid_condition_bt_node
+ - nav2_are_error_codes_active_condition_bt_node
+ - nav2_would_a_controller_recovery_help_condition_bt_node
+ - nav2_would_a_planner_recovery_help_condition_bt_node
+ - nav2_would_a_smoother_recovery_help_condition_bt_node
- nav2_initial_pose_received_condition_bt_node
- nav2_reinitialize_global_localization_service_bt_node
- nav2_rate_controller_bt_node
This fixes the situation: BT tree loads and works well.
Problem No.4 detected on multi-robot launch:
Robots are being started, but the local costmap is empty as follows on the screenshot below:
Local costmap is not being updated due to absence of any source for obstacle layer. The problem is in incorrect YAML-file:
$ cat nav2_multirobot_params_1.yaml
...
local_costmap:
local_costmap:
ros__parameters:
global_frame: odom
rolling_window: true
width: 3
height: 3
resolution: 0.05
robot_radius: 0.22
plugins: ["obstacle_layer", "inflation_layer"]
inflation_layer:
plugin: "nav2_costmap_2d::InflationLayer"
cost_scaling_factor: 3.0
inflation_radius: 0.55
obstacle_layer:
plugin: "nav2_costmap_2d::ObstacleLayer"
enabled: True
observation_sources: scan <- should be here
scan:
topic: /robot1/scan
max_obstacle_height: 2.0
clearing: True
marking: True
raytrace_max_range: 3.0
raytrace_min_range: 0.0
obstacle_max_range: 2.5
obstacle_min_range: 0.0
static_layer:
map_subscribe_transient_local: True
always_send_full_costmap: True
observation_sources: scan <- looks like moved here by mistake
Moreover, both nav2_multirobot_params_1/2.yaml
configs are better to have to actual set of plugins from latest nav2_params.yaml
.
After synching-up to latest changes, multi-robot test is working fine for non-composition nodes:
Problem 1, 3-4 fixed | Separate nodes | Composite nodes |
---|---|---|
Single bot simulation test | PASS | PASS |
Multi-robot simulation test | PASS | FAIL due to problem 2 (not fixed yet) |
However, if swap node1 and node2 parameters in the YAML-file, the "PASS" will go to the first met in YAML.
That would seem to point to my comment that perhaps multiple nodes cannot be nested together in yaml files?
it will produce only one root namespace when doing root key substitution
.... and that works for us? since for the non-composable node case everything works fine for all YAML variations
implies so. If that is working right, then that might be a point of where the composition stuff isn't working properly: composition nodes aren't parsing multiple childs in a global yaml file namespace.
Sure on the BT node stuff - that's an easy fix just to add in the missing nodes to the multirobot launch file (or just delete that parameter from the multirobot files and use the defaults in the code, which I'm pretty good about keeping up with).
On the local costmap stuff, yeah looks like someone (probably me?) copy pasted something around incorrectly
Please open a PR to resolve those sub-points
.... and that works for us?
One root namespace is that exactly RewrittenYaml
produces, and this did not initially not work for us in composition nodes case. If take an experiment with intentionally repeating root namespaces (which is actually should not work), it also does not work for composition nodes, which is expected behavior. So, currently we have no visible and straight workaround for this problem.
If that is working right, then that might be a point of where the composition stuff isn't working properly: composition nodes aren't parsing multiple childs in a global yaml file namespace.
If I understand you correctly, you are agree that this is the point where we have wrong YAML-parser behavior for composable nodes, right? Or do you mean that composition nodes should not parse (are not expected to do) multiple childs in a global yaml file namespace, like:
aaa:
bbb:
ros_params:
some_val: "BBB"
ccc:
ros_params:
another_val: "CCC"
The Issue2 seems to be fixed by the following patch in launch_ros
:
diff --git a/launch_ros/launch_ros/utilities/to_parameters_list.py b/launch_ros/launch_ros/utilities/to_parameters_list.py
index 534ec7e..6aad203 100644
--- a/launch_ros/launch_ros/utilities/to_parameters_list.py
+++ b/launch_ros/launch_ros/utilities/to_parameters_list.py
@@ -48,8 +48,8 @@ def __normalize_parameters_dict(dictionary):
if isinstance(value, dict):
keys.append(key.lstrip('/'))
result_dict = normalize_parameters_dict(value, keys, result_dict)
- # Reset keys in case there are multiple ros__parameter entries
- keys = []
+ # Clean-up keys for the case of multiple nodes/ros__parameter entries
+ keys.pop()
return result_dict
, it also does not work for composition nodes, which is expected behavior
What if we make the rewritten yaml function not place all under a single parent namespace, but each node gets its own root namespace... namespace? That would fix up Nav2 for the time being, no?
aaa:
bbb:
ros_params:
some_val: "BBB"
aaa:
ccc:
ros_params:
another_val: "CCC"
If I understand you correctly, you are agree that this is the point where we have wrong YAML-parser behavior for composable nodes, right? Or do you mean that composition nodes should not parse (are not expected to do) multiple childs in a global yaml file namespace
My previous understanding was that no nodes could do that, so I'm surprised to hear otherwise. I don't think its expected for composition nodes to act differently from normal nodes, no.
Edit: It looks like you're fixing launch itself, cool! Then that might not be necessary to change rewritten yaml
Now multi-TB3 simulation works fine after all fixes were applied.
Awesome! Thanks for the investigation!
Has this fix made it into the latest Humble binaries?