launch
launch copied to clipboard
launch not killing processes started with shell=True
Bug report
Required Info:
- Operating System:
- Ubuntu 20.04
- Installation type:
- binaries
- Version or commit hash:
- Foxy 0.10.6-1focal.20210901.033555
- DDS implementation:
- Fast
Steps to reproduce issue
import launch_testing
import unittest
import os
from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument, ExecuteProcess
from launch.substitutions import PathJoinSubstitution, LaunchConfiguration
from launch_ros.actions import Node
from launch.actions import IncludeLaunchDescription
from launch.substitutions import PathJoinSubstitution
from launch_ros.substitutions import FindPackageShare
from launch.launch_description_sources import PythonLaunchDescriptionSource
def generate_test_description():
# This is necessary to get unbuffered output from the process under test
proc_env = os.environ.copy()
proc_env['PYTHONUNBUFFERED'] = '1'
dut_process = ExecuteProcess(cmd=['/usr/bin/sleep', '5'])
pkg_dir = FindPackageShare('gazebo_ros')
full_path = PathJoinSubstitution([pkg_dir] + ['launch', 'gazebo.launch.py'])
gazebo_launch = IncludeLaunchDescription(
PythonLaunchDescriptionSource(
full_path))
return LaunchDescription([
gazebo_launch,
dut_process,
launch_testing.actions.ReadyToTest(),
]), {'dut_process': dut_process}
class TestWait(unittest.TestCase):
def test_wait_for_end(self, proc_output, proc_info, dut_process):
# Wait until process ends
proc_info.assertWaitForShutdown(process=dut_process, timeout=10)
@ launch_testing.post_shutdown_test()
class TestSuccessfulExit(unittest.TestCase):
def test_exit_code(self, proc_info, dut_process):
# Check that dut_process finishes with code 0
launch_testing.asserts.assertExitCodes(proc_info, process=dut_process)
Execute it with launch_test
or as part of add_launch_test
in CMake.
Expected behavior
When it ends, gazebo should be killed.
Actual behavior
Gazebo is not killed.
Additional information
Happens with other nodes besides gazebo. But gazebo is particularly troublesome because I cannot have 2 tests running sequentially using gazebo, because the first test didn't kill the server, and only one server may be running in a machine.
Might be related to https://github.com/ros2/launch/issues/495 because if instead of launch_testing
I use ros2 launch gazebo_ros gazebo.launch.py
, I can kill it properly with Ctrl+C, but not with kill -SIGINT
. Even if I use ros2 launch gazebo_ros gazebo.launch.py --noninteractive
Hmm, @v-lopez this looks like Gazebo not forwarding signals properly. Would you mind trying to force launch_testing
's LaunchService
instance to be non-interactive? If that fixes the issue, we can submit it as a patch.
I tried that (and verified that it was not being overwritten to False
), and the behavior was the same.
I have the same problem with a standalone unity executable not being killed, but I can kill all the processes correctly if I Ctrl+C before the test is complete
My current workaround is to add a post_shutdown_test
to kill the remaining processes
@launch_testing.post_shutdown_test()
class TestShutdown(unittest.TestCase):
def test_kill_sim(self):
unity_pid = check_output(["pidof", "-z", "unity_executable.x86_64"])
print('got unity pid')
unity_pid = int(unity_pid)
print('unity pid is: ' + str(unity_pid))
os.kill(unity_pid, signal.SIGKILL)
print('Killed unity')
I think the reason this is happening is because the executable is ran from a series of launch files, then the last launch file calls a bash script that uses ros2 run unity_executable.x86_64
. I can provide more details if needed
I'm experiencing this issue with gzserver, gzclient, and rqt. Here's a minimal reproducible example with Gazebo:
import sys
import time
import unittest
import launch
from launch.actions import IncludeLaunchDescription
from launch.launch_description_sources import PythonLaunchDescriptionSource
from launch_ros.substitutions import FindPackageShare
from launch_testing.actions import ReadyToTest
import launch_testing.markers
import pytest
@pytest.mark.launch_test
def generate_test_description():
return launch.LaunchDescription(
[
IncludeLaunchDescription(
PythonLaunchDescriptionSource([FindPackageShare('gazebo_ros'), '/launch/gazebo.launch.py'])
),
ReadyToTest(),
])
class MyTestFixture(unittest.TestCase):
def test_something(self):
time.sleep(10.0)
print('END TEST', file=sys.stderr)
We were facing a similar issue in ignition fortress, where launch was not able to kill the gazebo process. As @hidmic mentioned, it was related to Gazebo not forwarding signals. That was fixed using this PR, not sure if that is related : https://github.com/gazebosim/gz-launch/pull/151
I'm experiencing this issue with gzserver, gzclient, and rqt. Here's a minimal reproducible example with Gazebo:
import sys import time import unittest import launch from launch.actions import IncludeLaunchDescription from launch.launch_description_sources import PythonLaunchDescriptionSource from launch_ros.substitutions import FindPackageShare from launch_testing.actions import ReadyToTest import launch_testing.markers import pytest @pytest.mark.launch_test def generate_test_description(): return launch.LaunchDescription( [ IncludeLaunchDescription( PythonLaunchDescriptionSource([FindPackageShare('gazebo_ros'), '/launch/gazebo.launch.py']) ), ReadyToTest(), ]) class MyTestFixture(unittest.TestCase): def test_something(self): time.sleep(10.0) print('END TEST', file=sys.stderr)
Fix here : https://github.com/ros-simulation/gazebo_ros_pkgs/pull/1376
Another minimal example :
import sys
import time
import unittest
import launch
from launch.actions import IncludeLaunchDescription, ExecuteProcess
from launch.launch_description_sources import PythonLaunchDescriptionSource
from launch_ros.substitutions import FindPackageShare
from launch_testing.actions import ReadyToTest
import launch_testing.markers
import pytest
@pytest.mark.launch_test
def generate_test_description():
return launch.LaunchDescription(
[
ExecuteProcess(
cmd=['python3','test1.py'],
shell=True,
output='both'
),
ExecuteProcess(
cmd=['python3','test2.py'],
shell=False,
output='both'
),
ReadyToTest(),
])
class MyTestFixture(unittest.TestCase):
def test_something(self):
time.sleep(10.0)
print('END TEST', file=sys.stderr)
Where both test files are just :
import time
while 1 :
time.sleep(0.5)
print("Hello 1")
test1.py
does not terminate, yet test2.py
does.
This is not a problem IMO.
When you use shell=True
, you're running a shell process, which in this case will run python3 test1.py
as a subprocess.
Launch is correctly sending a signal to the shell process, but that one is not forwarding it to its subprocess.
You probably can fix that be trapping SIGINT/SIGTERM and killing subprocesses before exiting the shell (e.g. with kill -$SIGNAL $(jobs -p)
).
I don't think we can do anything by default in launch that's reasonable, people should be careful when using shell=True
.
I have updated the issue title, as I don't think the original one was correct.
This is easy to reproduce from a pure launch file:
ros2 launch ./test.launch.py &
kill -SIGINT $!
# ./test.launch.py
import launch
from launch.actions import ExecuteProcess
PYTHON_SCRIPT="""\
import time
while 1:
time.sleep(0.5)
"""
def generate_launch_description():
return launch.LaunchDescription(
[
ExecuteProcess(
cmd=['python3', '-c', f'"{PYTHON_SCRIPT}"'],
shell=True,
output='both'
),
ExecuteProcess(
cmd=['python3', '-c', f'{PYTHON_SCRIPT}'],
shell=False,
output='both'
),
])
after waiting the launch process to finish (it will take some seconds and log some stuff), it's easy to check that the second process was killed and the first wasn't (e.g. using ps -elf
).
Note:
- The issue isn't noticed if instead of using
kill -SIGINT $!
the user pressesctrl+c
. That's because pressingctrl+c
will send a SIGINT signal to all the processes of the group (all the processes launch from the launch file). - Currently if SIGTERM is sent, instead of SIGINT, no process is being killed. That seems like a separate issue that should be handled as well.
- We could maybe try to send a signal to the whole group (in the platforms that support this), or to find all the children of the launch process recursively (e.g. with psutil) to solve the problem.
I will give it a try to the last option, but I'm not sure how multiplatform that's going to be.
cc @jacobperron
I have opened https://github.com/ros2/launch/pull/632, which will fix the issue.
This issue has been mentioned on ROS Discourse. There might be relevant details there:
https://discourse.ros.org/t/sequentially-starting-launch-files-using-ros-2-python-when-a-specific-log-message-is-detected/30633/3