
Hang after repeated ctrl+c on Windows

Open Ace314159 opened this issue 2 years ago • 3 comments

Bug report

Required Info:

  • Operating System:
    • Windows 10
  • Installation type:
    • From source
  • Version or commit hash:
    • Foxy
  • DDS implementation:
    • N/A
  • Client library (if applicable):
    • N/A

Steps to reproduce issue

This occurs with any launch file where quitting takes enough time for multiple ctrl+c's to be sent. I've verified the issue with the following two:

ros2 launch composition composition_demo.launch.py
ros2 launch lifecycle lifecycle_demo.launch.py

Run either of the two commands, and after all the nodes launch, press ctrl+c repeatedly really fast. It might take a couple of tries.

Expected behavior

The ros2 process should exit without hanging.

Actual behavior

The ros2 process hangs forever.

Additional information

This issue is caused by a Python bug where a KeyboardInterrupt hangs thread.join on Windows. Although the bug has been fixed in newer Python versions, ROS Foxy uses Python 3.8, which does not include the fix. I have verified that applying the fix solves the issue.

The ROS code that uses thread.join is here: https://github.com/ros2/launch_ros/blob/a58b2e650ca3b11c733e350c69f7462e12c7fa9a/launch_ros/launch_ros/ros_adapters.py#L88
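
For context, the pattern involved looks roughly like the sketch below. This is a simplified, hypothetical illustration and not the actual launch_ros code: the class `BackgroundSpinner` and its methods are made-up names, and the sleep loop stands in for real work.

```python
import threading
import time


class BackgroundSpinner:
    """Hypothetical stand-in for an object that spins in a background thread."""

    def __init__(self):
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._spin)

    def start(self):
        self._thread.start()

    def _spin(self):
        # Placeholder for real work; loops until shutdown is requested.
        while not self._stop.is_set():
            time.sleep(0.01)

    def shutdown(self):
        self._stop.set()
        # Blocking join on the calling thread. Per the report above, a
        # KeyboardInterrupt arriving during this call can leave it hanging
        # on Windows with Python 3.8.
        self._thread.join()
        return not self._thread.is_alive()
```

Without an interrupt, `shutdown()` returns as soon as the worker notices the stop request. The hang only shows up when ctrl+c lands while the join is in progress.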

Log from the command:

[INFO] [launch]: All log files can be found below C:\Users\{userName}\.ros\log\{logName}
[INFO] [launch]: Default logging verbosity is set to INFO
[INFO] [component_container.EXE-1]: process started with pid [21788]
[component_container.EXE-1] [INFO] [1649431966.573165700] [my_container]: Load Library: C:\opt\ros\foxy\x64/bin/talker_component.dll
[component_container.EXE-1] [INFO] [1649431966.579357900] [my_container]: Found class: rclcpp_components::NodeFactoryTemplate<composition::Talker>
[component_container.EXE-1] [INFO] [1649431966.579479100] [my_container]: Instantiate class: rclcpp_components::NodeFactoryTemplate<composition::Talker>
[INFO] [launch_ros.actions.load_composable_nodes]: Loaded node '/talker' in container '/my_container'
[component_container.EXE-1] [INFO] [1649431966.594363300] [my_container]: Load Library: C:\opt\ros\foxy\x64/bin/listener_component.dll
[component_container.EXE-1] [INFO] [1649431966.595254800] [my_container]: Found class: rclcpp_components::NodeFactoryTemplate<composition::Listener>
[component_container.EXE-1] [INFO] [1649431966.595317300] [my_container]: Instantiate class: rclcpp_components::NodeFactoryTemplate<composition::Listener>
[INFO] [launch_ros.actions.load_composable_nodes]: Loaded node '/listener' in container '/my_container'
[WARNING] [launch]: user interrupted with ctrl-c (SIGINT)
[WARNING] [launch]: user interrupted with ctrl-c (SIGINT) again, ignoring...
[WARNING] [launch]: user interrupted with ctrl-c (SIGINT) again, ignoring...
[INFO] [component_container.EXE-1]: process has finished cleanly [pid 21788]

Ace314159 avatar Apr 08 '22 15:04 Ace314159

Any suggestions for what we could do in launch to work around the known issue?

jacobperron avatar Apr 29 '22 00:04 jacobperron

I guess this affects Galactic (and maybe Humble) too, since they are also using Python 3.8 afaik.

jacobperron avatar Apr 29 '22 00:04 jacobperron

I'm not sure how to fix this other than just updating to Python 3.9. It might be possible to make the thread a daemon and not use join, but I'm not sure if that would work or have other side effects.
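
The daemon-thread idea could look roughly like the following sketch. It is untested against launch_ros and has not been checked on an affected Windows setup, so whether it actually avoids the hang is an assumption. `SpinWorker` and its methods are hypothetical names. Instead of calling join, shutdown waits a bounded time on an event that the worker sets when it exits.

```python
import threading
import time


class SpinWorker:
    """Hypothetical background worker that is shut down without join()."""

    def __init__(self):
        self._stop_requested = threading.Event()
        self._finished = threading.Event()
        # daemon=True: the interpreter does not wait for this thread at exit.
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        try:
            while not self._stop_requested.is_set():
                time.sleep(0.01)  # placeholder for real work
        finally:
            # Signal completion so shutdown() does not need join().
            self._finished.set()

    def shutdown(self, timeout=1.0):
        """Request a stop and wait at most `timeout` seconds for the worker."""
        self._stop_requested.set()
        # Returns True if the worker finished in time, False otherwise.
        return self._finished.wait(timeout)
```

One side effect to keep in mind: a daemon thread is stopped abruptly at interpreter exit, so any cleanup the worker has not reached by then is skipped.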

Ace314159 avatar Apr 29 '22 19:04 Ace314159