avocado Spawner support for killing tasks

Is your feature request related to a problem? Please describe.

The legacy "runner" architecture can kill tests and their children, but that ability is strictly limited to processes. The nrunner architecture, which relies on spawners to create and check on tasks, has no equivalent feature.

Spawners have the ability to create tasks (reporting success or not), and check on their status, but nothing that allows tasks to be forcefully terminated.

When users have an Avocado job running and signal that they don't want to proceed any further, Avocado (nrunner) should, in addition to not spawning pending tasks, also attempt (best effort) to terminate existing ones.

Describe the solution you'd like

A new method, such as terminate_task(), can be added to the Spawner plugin interface. For the process spawner, the implementation will probably be similar to the existing legacy runner approach of sending SIGTERM/SIGKILL to the task process and its children. For the Podman spawner, it can be as simple as killing the container itself (podman kill), which also kills the task running inside it.
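To make the process spawner idea above more concrete, here is a minimal sketch of a "SIGTERM first, SIGKILL after a grace period" termination. It is not Avocado's code: `terminate_process_tree`, `demo`, and the `grace` parameter are hypothetical names, and it assumes a POSIX system and that the task process was started in its own session, so that signaling its process group also reaches its children.

```python
import asyncio
import os
import signal
import sys


async def terminate_process_tree(proc, grace=2.0):
    """Best-effort termination of a process and its children.

    Hypothetical helper, not part of Avocado. Assumes `proc` was started
    with start_new_session=True, so proc.pid is also its process group id.
    Returns True once the process is known to have exited.
    """
    if proc.returncode is not None:
        return True  # already finished, nothing to do
    try:
        # Ask the whole process group to exit.
        os.killpg(proc.pid, signal.SIGTERM)
    except ProcessLookupError:
        return True  # the group is already gone
    try:
        await asyncio.wait_for(proc.wait(), timeout=grace)
        return True
    except asyncio.TimeoutError:
        pass  # did not exit within the grace period, escalate
    try:
        os.killpg(proc.pid, signal.SIGKILL)
    except ProcessLookupError:
        return True
    await proc.wait()
    return proc.returncode is not None


async def demo():
    # A long-running child standing in for a task (assumption for the demo).
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "import time; time.sleep(60)",
        start_new_session=True,
    )
    terminated = await terminate_process_tree(proc)
    return terminated, proc.returncode
```

Running `asyncio.run(demo())` starts a sleeping child and terminates it. A Podman variant would not need the process group handling, since killing the container takes the task down with it.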

This is related to https://github.com/avocado-framework/avocado/issues/4911, but it's considered an enhancement instead of a bug.

Sep 30 '21 02:09 clebergnu

Adding the QEMU label, because this was partially motivated by tests leaving QEMU running.

Mar 16 '22 13:03 clebergnu

I'll re-evaluate this in terms of latest Avocado and QEMU.

Mar 15 '23 13:03 clebergnu

As a side note, we also already have an implementation for the LXC spawner here:

    async def terminate_task(self, runtime_task):
        container = lxc.Container(runtime_task.spawner_handle)

        # Try a clean shutdown first, waiting up to 30 seconds
        if not container.shutdown(30):
            LOG.warning("Failed to cleanly shutdown the container, forcing.")
            # Fall back to a forced stop
            if not container.stop():
                LOG.error("Failed to kill the container")
                return False
        # Report success explicitly instead of returning None
        return True

However, I haven't fully tested it, since we use a different scheduler and not terminate_worker.terminate_tasks_*() (if I understand correctly, that is how it is done here). I am also not fully sure this implementation is a good idea: killing the scheduler/job process already kills all processes spawned via lxc-attach, which makes interruption on LXC containers fairly clean and leaves the containers running (something we may even want to allow, since LXC containers may represent running subsystems).

Jun 13 '23 08:06 pevogam

I tested #5788 to the limits, and it solves this issue AFAICT. @pevogam, if you think we still have issues with the LXC spawner, such as needing an option to destroy() the container, we can add that on top of this. Anyone finding issues here, feel free to reopen this issue.

Dec 21 '23 12:12 clebergnu
