Add ability to reboot the machine after workflow is finished

Open invidian opened this issue 4 years ago • 11 comments

For workflows that provision the OS, it would be nice if the workflow itself could reboot the machine once it is done, so the machine can boot into the target OS and the upper orchestration system (e.g. a person who monitors the provisioning process, or some kind of logic which uses IPMI) doesn't need to care about that.

Things to consider:

  • a worker can be part of multiple workflows. Perhaps the reboot should only happen when all workflows have finished successfully.
  • perhaps a workflow could indicate that a reboot is needed after it finishes, e.g. by setting a reboot parameter to true.
  • the action or task can't trigger a reboot by itself, as this will shut down the worker and it won't be able to report that reboot task succeeded

invidian avatar Apr 22 '20 13:04 invidian

it seems that we now have a documented way to do a reboot from an action at https://docs.tinkerbell.org/actions/action-architecture/#namespace:

When an action attempts to do these steps in a container in its own namespace, nothing will occur as PID 1 is usually the process in the action container. To allow the expected behaviour an action can use pid: host in its configuration, this will mean that the action processes will be amongst all of the processes on the host itself (including the "real" PID 1). With the action in the host process ID namespace both a reboot or kexec will be able to work as expected.
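
As a rough illustration of the quoted paragraph, an action entry that joins the host PID namespace might look like the following. The image name reboot-host:latest is hypothetical and is assumed to simply run a reboot command; only the pid: host setting comes from the documentation quoted above.

```yaml
# Sketch only: the image name is a placeholder, not a published action.
- name: "reboot"
  image: reboot-host:latest   # hypothetical image that runs a reboot command
  timeout: 90
  pid: host                   # share the host PID namespace so the real PID 1 is reachable
```

As the issue description notes, a reboot triggered this way may shut down the worker before it can report the action as successful.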

Is this issue about improving on that?

rgl avatar May 25 '21 13:05 rgl

This is fixed in tink-worker. This can probably be closed! 😀

thebsdbox avatar May 25 '21 13:05 thebsdbox

@thebsdbox, by fixed, you mean using an action with pid: host?

having a docs example on how to reboot from a workflow would also be really nice :-)

I found a reboot example at https://docs.tinkerbell.org/deploying-operating-systems/examples-win/#creating-a-reboot-action-dockerfile:

FROM busybox
ENTRYPOINT [ "touch", "/worker/reboot" ]

is that it? we just need to create a new file named /worker/reboot?

rgl avatar May 25 '21 14:05 rgl

Creating a file named /worker/reboot does not trigger a reboot from tink-worker:

[Screenshot: rpi-tinkerbell-vagrant BIOS worker, 2021-05-26 09:12:14]

Here's the workflow status:

+----------------------+--------------------------------------+
| FIELD NAME           | VALUES                               |
+----------------------+--------------------------------------+
| Workflow ID          | be378bb1-bdf9-11eb-9be0-0242ac120005 |
| Workflow Progress    | 100%                                 |
| Current Task         | hello-world                          |
| Current Action       | reboot                               |
| Current Worker       | 00000000-0000-4000-8000-080027000001 |
| Current Action State | STATE_SUCCESS                        |
+----------------------+--------------------------------------+
+--------------------------------------+-------------+-------------+----------------+---------------------------------+---------------+
| WORKER ID                            | TASK NAME   | ACTION NAME | EXECUTION TIME | MESSAGE                         | ACTION STATUS |
+--------------------------------------+-------------+-------------+----------------+---------------------------------+---------------+
| 00000000-0000-4000-8000-080027000001 | hello-world | reboot      |              0 | Started execution               | STATE_RUNNING |
| 00000000-0000-4000-8000-080027000001 | hello-world | reboot      |              0 | finished execution successfully | STATE_SUCCESS |
+--------------------------------------+-------------+-------------+----------------+---------------------------------+---------------+

rgl avatar May 26 '21 08:05 rgl

Ah, this needs hook; hook has the logic to watch for the reboot.

thebsdbox avatar May 26 '21 12:05 thebsdbox

Can we use sysrq-r from an action? https://hub.docker.com/r/mlafeldt/sysrq/ for example.

> the action or task can't trigger a reboot by itself, as this will shut down the worker and it won't be able to report that reboot task succeeded

Does the action need to be Tinkerbell specific and act as the worker to signal success?

displague avatar Oct 28 '21 15:10 displague

I built a Docker image as per the example @rgl already mentioned here, to no avail:

The "touch" is going nowhere and thus the rebootWatch() never fires.

A manual touch in the getty container to "/run/worker/reboot" works, so the watch is active. It just looks like the volume mapping is wrong? (/worker:/worker)

Edit: it works; the workflow was just hanging somehow. I recreated it and it works as advertised:

  • build the Docker image as in the Windows example
  • tag and push it to the local registry
  • add the action as in the same example

profi...reboot :)

double-p avatar Apr 06 '22 12:04 double-p

  - name: "reboot into Windows"
    image: reboot:latest
    timeout: 90
    volumes:
    - /worker:/worker

I encountered the same issue when rebooting into Windows: the action failed (STATE_FAILED). Is there anywhere I can look up the error message?

yeahdongcn avatar Dec 23 '22 03:12 yeahdongcn

It turns out the document is incorrect. I just sent out a PR to fix it.

yeahdongcn avatar Dec 23 '22 05:12 yeahdongcn

We intend to draw up a proposal for embedding restart capabilities into workflows so we don't need to rely on actions. This will complement the desire to see workflows consistently transition to an end state, which currently doesn't happen if the restart beats the restart action's update.

chrisdoherty4 avatar May 08 '23 22:05 chrisdoherty4

https://github.com/tinkerbell/roadmap/issues/29 will see this come to fruition.

chrisdoherty4 avatar Oct 16 '23 01:10 chrisdoherty4