Allow end-tasks, which will run when a workspace stops (timeout)

Open shaal opened this issue 4 years ago • 36 comments

Similar to start-tasks, I propose adding end-tasks. End-tasks would be triggered and run when a workspace is stopped (timeout).

Why it's necessary:

  • In a "regular" (non-Gitpod) setup on a local machine, working on projects that store content and changes in a database is fairly straightforward. I can pull the database once, switch between different branches, make changes in the database, and continue working on the project for many days.
  • In a Gitpod setup, I can pull the database once, and any changes I make in the database will persist through timeout and a restart of a workspace. The next day (or minute) I want to work on a different branch so I start a NEW workspace, and poof! the database is gone.

How it works / Why it's a good idea:

  • Once end-tasks are available, I can make sure that a machine that is about to be stopped runs a command that stores the important information / database / docker-image in a safe place. The next time I open a new workspace, I can choose to fetch the database I saved.
  • I am sure there are many more scenarios that will be helpful for people once the end-tasks feature is available.

shaal avatar Apr 16 '21 04:04 shaal

Related issue: https://github.com/gitpod-io/gitpod/issues/4055 Technical details: https://github.com/gitpod-io/gitpod/issues/1961

shaal avatar May 30 '21 10:05 shaal

Hey @shaal, the feature makes absolute sense; I'm trying to understand your particular use case better. It sounds like you are trying to do two things:

  1. persist in-memory state to disk before stop, so when I restart a workspace it has the same state again.
  2. keep state across multiple workspaces.

I have no questions regarding 2). I wonder what the scope of that state is: is it per user per project, or per project? (In which case I wonder if it should not be part of the init tasks or checked into git.)

I would like to understand how you intend to surface to users why they have the state they are in and how they can control it. What would a user do if they want a clean slate for some reason, or have a different DB schema because they are working on some branch that has migrations?

These issues are generally the reason why I'd recommend creating fresh workspaces per branch and not sharing such data across different project states. Within a workspace, I want to keep the state, of course. So when I stop it and later start it again, it definitely should have the same DB state without question.

svenefftinge avatar Jun 17 '21 07:06 svenefftinge

I'd want this for commits: if you accidentally delete a workspace, you cannot recover your changes. So I want a default commit and push, with a timestamp or slug, saved to a new branch.

chlbri avatar Jul 13 '21 18:07 chlbri

I created an example of end-task that I want to use: https://github.com/shaal/DrupalPod/pull/18

When a workspace shuts down, .gitpod/aws-backup.sh is called (it creates a binary MySQL backup and stores it in AWS). When I open a new workspace where I want to use the previous workspace's database, I'll run .gitpod/aws-restore.sh, which will restore the latest backup for this branch. Alternatively, I can run .gitpod/aws-restore [name_of_backup] to restore a specific named backup I created on a separate workspace.
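
A minimal sketch of what such a backup hook might look like. It is not the script from the PR above: it uses mysqldump (a logical dump, swapped in for the binary backup described), and it assumes that MySQL credentials are already configured (for example in ~/.my.cnf) and that the AWS CLI is installed and authenticated. The function names, the bucket argument, and the backups/<branch>/<name>.sql.gz key layout are all hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical key layout: backups/<branch>/<name>.sql.gz
# "/" in branch names (feature/foo) is replaced so each branch maps to one folder.
backup_key() {
  local branch="$1"
  local name="${2:-latest}"
  local safe_branch
  safe_branch="$(printf '%s' "$branch" | tr '/' '-')"
  printf 'backups/%s/%s.sql.gz\n' "$safe_branch" "$name"
}

# Dump all databases, compress, and stream the result to S3.
# Runs in a subshell so the pipefail setting does not leak to the caller.
backup_db() (
  set -o pipefail  # fail if mysqldump or gzip fails, not only the upload
  bucket="$1"
  branch="$2"
  name="${3:-latest}"
  mysqldump --single-transaction --all-databases \
    | gzip \
    | aws s3 cp - "s3://${bucket}/$(backup_key "$branch" "$name")"
)
```

A call such as `backup_db my-bucket "$(git rev-parse --abbrev-ref HEAD)"` would then store the dump under the current branch's folder, and passing a third argument would create a named backup.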

shaal avatar Jul 13 '21 20:07 shaal

This would be great. It would allow me to do some simple telemetry or housekeeping tasks!

orellabac avatar Jul 20 '21 16:07 orellabac

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Oct 18 '21 17:10 stale[bot]

Can we please add the label meta: never-stale to this issue?

shaal avatar Oct 18 '21 19:10 shaal

Ok

chlbri avatar Oct 20 '21 10:10 chlbri

I'm looking for this feature too and ended up here. My use case is as follows:

I'd like to use Gitpod in teaching an introductory programming course. These beginner students have enough on their plates trying to learn to code without having to learn Git (for now), so I'd foresee providing a shell script in the workspace which does git add/commit/push for them without blowing their minds. If I could run this from end-tasks it would be wonderful: their code would always get pushed to GitHub automatically, where I could trigger unit tests etc., without having to even mention the words "staging area" to my students.
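
A rough sketch of the kind of autosave script described here, assuming the workspace is a git checkout with a remote named origin that the student is allowed to push to. The function names and the commit-message format are illustrative only.

```shell
#!/usr/bin/env bash

# Build the commit message from a timestamp passed in by the caller.
autosave_message() {
  printf 'Autosave %s\n' "$1"
}

# Stage everything, commit only if something changed, then push the current branch.
autosave() {
  local stamp
  stamp="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  git add --all || return 1
  # `git diff --cached --quiet` exits 0 when nothing is staged.
  if git diff --cached --quiet; then
    echo "nothing to save"
    return 0
  fi
  git commit --quiet -m "$(autosave_message "$stamp")" && git push --quiet origin HEAD
}
```

Students would only ever run `autosave` (or have it run for them on shutdown); the staging, commit, and push steps stay hidden inside the function.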

john-french avatar Oct 29 '21 16:10 john-french

This would unblock me on a number of different issues I've worked around in my own ways.

For starters, we currently network a set of dependent workspaces together via Tailscale as Ephemeral Nodes. The issue is that, despite the nodes being "ephemeral", Tailscale doesn't perform housekeeping until as long as 48 hours later. This leads to long-forgotten or deleted Gitpod instances lingering and causing duplicate hostnames.

Tailscale's official recommendation when you need a consistent hostname is to instead remove the node via an API call:

If you're using hostnames to refer to things, and need to have the node deleted as part of your workflow, then you can make an API call from your automation system.

Here is the API spec for that

Other than that, the cleanup of ephemeral nodes is a bit lazy, but they shouldn't linger more than a day or two.

With an end task in Gitpod we could automate that API call to clean-up the tailscale node before shutting down.
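
A sketch of that cleanup call, assuming the device ID and an API key are available in the workspace as TAILSCALE_DEVICE_ID and TAILSCALE_API_KEY (both variable names are hypothetical). It targets the device-deletion endpoint of the Tailscale API with the key sent as the basic-auth username; check the current API spec before relying on it.

```shell
#!/usr/bin/env bash

# URL of the device resource for a given device ID.
tailscale_device_url() {
  printf 'https://api.tailscale.com/api/v2/device/%s\n' "$1"
}

# Delete this workspace's node from the tailnet.
remove_tailscale_node() {
  # --fail makes curl return non-zero on HTTP errors, so callers can react.
  curl --silent --show-error --fail \
    --request DELETE \
    --user "${TAILSCALE_API_KEY:?set TAILSCALE_API_KEY}:" \
    "$(tailscale_device_url "${TAILSCALE_DEVICE_ID:?set TAILSCALE_DEVICE_ID}")"
}
```

Calling `remove_tailscale_node` from a shutdown handler would free the hostname right away instead of waiting for Tailscale's own cleanup.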


On a different note, our developers often work with persistent data in MySQL. Even though it persists through workspace restarts (the data is stored under the /workspace/mysql path), there are times when one developer wishes they could share the latest DB dump with someone else. We've crudely solved this with one script that dumps the database to cloud storage and another that restores from a named DB dump. Sometimes team members forget to run this, though, and they lose the latest copy of their data.

End tasks could help here as well by automatically calling the script to have an autosave backup just in case.

joepurdy avatar Feb 18 '22 02:02 joepurdy

Our projects heavily rely on cloud infrastructure for development.

I see this feature as extremely useful for cleaning up the associated cloud infrastructure:

  • on start tasks, we create AWS service instances using Terraform/Ansible (database, event bus, storage, stream processor, etc.)
  • on end tasks, we would destroy those instances

jetdream avatar Feb 18 '22 10:02 jetdream

We would like this feature as well. For us, the primary benefit would be cleaning up caches that are left around the workspace. This leads to workspaces taking a very long time to come back up (or, often, never coming back up at all). If we could hook an end task, automating a clean of the workspace would be trivial.

jkaye2012 avatar Mar 19 '22 15:03 jkaye2012

Thank you all for your input. I noticed that this wasn't in a team inbox, @shaal, so I've just put it in the WebApp team inbox.

WebApp team - if this doesn't belong to your team, can you move it to the correct inbox? I thought this may be one of those that requires input from every eng team... Your call!

pawlean avatar Mar 20 '22 09:03 pawlean

In today's workspaces, prior to a regular shutdown all processes receive SIGTERM. They then have 15 seconds before receiving SIGKILL. That's not quite as convenient as shutdown hooks, but it helps to, e.g., flush a DB to disk prior to shutdown.

csweichel avatar Mar 31 '22 07:03 csweichel

Interesting, one way to simulate a generic shutdown hook then would be to implement a daemon that we run locally that sleeps until it receives the SIGTERM, then fires the shutdown logic, correct?
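
The idea can be sketched as a small script, assuming bash and a placeholder on_shutdown function standing in for the real cleanup (both function names are made up for this example). Everything has to finish inside the 15-second window mentioned above, because SIGKILL cannot be trapped.

```shell
#!/usr/bin/env bash

# Placeholder for the real shutdown logic (flush a DB, upload a backup, ...).
on_shutdown() {
  echo "flushing state"
}

# Block until SIGTERM arrives, then run on_shutdown and exit.
wait_for_sigterm() {
  # Not declared local, so the trap handler can always see it.
  sleep_pid=""
  trap 'on_shutdown; [ -n "$sleep_pid" ] && kill "$sleep_pid" 2>/dev/null; exit 0' TERM
  while :; do
    # Sleep in the background and wait on it: `wait` returns as soon as a
    # trapped signal arrives, while a foreground sleep would delay the trap.
    sleep 3600 &
    sleep_pid=$!
    wait "$sleep_pid"
  done
}
```

Running `wait_for_sigterm` as the last command of a task keeps that terminal idle until the workspace stops, at which point on_shutdown runs.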

jkaye2012 avatar Mar 31 '22 13:03 jkaye2012

@akosyakov Pulling you in because this is more of supervisor territory; webapp would just provide/approve the additions to the config.

Does this seem reasonable to implement?

geropl avatar Apr 07 '22 07:04 geropl

I find the terminology confusing, i.e. start vs. end tasks. I think there are just tasks (shell sessions), and there should be some shutdown protocol for them, i.e. it could be something based on signals, on top of https://github.com/gitpod-io/gitpod/issues/3966#issuecomment-1084201788, with specified time guarantees, like a shutdown timeout. Maybe 15 seconds is already enough, by the way.

Pulling you in because this is more of supervisor territory; webapp would just provide/approve the additions to the config. Does this seem reasonable to implement?

I'm not sure whether it should be an extension of .gitpod.yml.

akosyakov avatar Apr 07 '22 12:04 akosyakov

From our perspective, this would be something that we would like in .gitpod.yml.

We have been able to successfully emulate this using signal handlers, but the result is a bit of a kludge as you end up with a script running for the entire lifetime of your pod just to catch a signal. It works well enough, could just be simpler to use is all.

jkaye2012 avatar Apr 07 '22 16:04 jkaye2012

The original thought behind the feature request was being able to run tasks before the workspace shuts down. Some tasks might take longer to run (e.g. making a copy of the current database and uploading it to the cloud).

It would be straightforward to define these tasks in .gitpod.yml in their own section (e.g. pre-shutdown).

shaal avatar Apr 07 '22 17:04 shaal

ok, makes sense, so you mean something like:

tasks: 
  - shutdown: sh ./scripts/shutdown.sh

Which is executed on SIGTERM event and has 15 seconds to complete?

akosyakov avatar Apr 08 '22 07:04 akosyakov

Which is executed on SIGTERM event and has 15 seconds to complete?

@akosyakov I think it would be better to have more than 15 seconds when done from .gitpod.yml 👀
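
Whatever limit ends up being chosen, a shutdown script can guard itself against overrunning it. A minimal sketch using GNU coreutils timeout (assumed to be available, as on typical Linux workspace images); run_with_budget is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Run a command with a time budget in seconds.
# GNU `timeout` exits with status 124 when the command had to be stopped.
run_with_budget() {
  local budget="$1"
  local rc
  shift
  timeout "$budget" "$@"
  rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "step exceeded its ${budget}s budget" >&2
  fi
  return "$rc"
}
```

For example, `run_with_budget 10 ./scripts/shutdown.sh` would leave a few seconds of headroom inside a 15-second window; the 10 is an arbitrary choice.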

axonasif avatar Apr 13 '22 17:04 axonasif

Adding it to tasks is problematic, as those terminals might not even exist anymore or might be waiting for a command to return. Also, this would make the time when such a command is executed a little fuzzy (before all terminals stop, or one by one?). So I think we should introduce a top-level hook for this:

tasks:
  - command: |
         start-db &

onWorkspaceStop: |
   stop-db
   sync-state.sh

The command would run after all terminals have been closed but before (?) the IDE has stopped. It should run in the same context as all other commands (i.e. as the gitpod user, in a shell, etc.).

svenefftinge avatar Jul 11 '22 14:07 svenefftinge

@svenefftinge I like it!

shaal avatar Jul 11 '22 14:07 shaal

Adding to IDE team sync next week to see if we can pick up the open PR and get it completed.

No promises on timeline, but we'll take a look!

Related internal thread.

loujaybee avatar Sep 23 '22 12:09 loujaybee

Relevant discussion: https://github.com/gitpod-io/gitpod/pull/11287#issuecomment-1190241322

csweichel avatar Sep 25 '22 14:09 csweichel

Removing from IDE sync, as it looks like @svenefftinge is looking into this! 🙏 🚀

loujaybee avatar Sep 27 '22 15:09 loujaybee

I read that some users had success by using SIGTERM. Can you post a link or some instructions on how you do it?

karpa avatar Oct 01 '22 19:10 karpa

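Hey @karpa, until #11287 is deployed, you can use this snippet in your .gitpod.yml:

tasks:
  - name: Shutdown daemon
    command: |
      # Called from the trap below when this shell receives SIGTERM.
      function shutdown() {
        # Do stuff here, for example
        docker-compose stop;
      }

      trap 'shutdown; exit' SIGTERM;
      # Clear the terminal and print a status line.
      printf '\033[3J\033c\033[3J%s\n' 'Waiting for SIGTERM ...';
      # Open a file descriptor that never delivers data, then block on it
      # with the read builtin so the trap can run when the signal arrives.
      exec {sfd}<> <(:);
      until read -t 3600 -u $sfd; do continue; done;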

axonasif avatar Oct 02 '22 01:10 axonasif

@karpa here is an example of how one can set it up till we have shutdown commands: https://github.com/akosyakov/gitpodify-docker-compose/blob/docker-compose_check/.gitpod.yml

akosyakov avatar Oct 05 '22 17:10 akosyakov

@svenefftinge is on holiday, so I took it out of progress for now.

geropl avatar Oct 14 '22 10:10 geropl