Race conditions with run command

Open justin808 opened this issue 1 year ago • 3 comments

The goal is to offer something like heroku run (docs) and heroku run:detached (docs).

We’ve run into a race condition, and we’d like some advice.

@Rafael Gomes tried to address this in PR 163.

Our current strategy:

  1. Clone the main application's standard workload into a cron workload, keeping as many settings identical as feasible. The new workload is named after the original, with -runner appended.
  2. When running a job, update the cron workload definition if needed, based on the following criteria (code here):
     a. Image: Sometimes we need a newer image than the deployed one, such as during a release, when new code brings new database migrations.
     b. CPU and memory

Thus, we could have a race condition if we’re updating the workload, and we try to kick off several jobs with different settings.
While it would be unfortunate to run some code with the wrong memory or CPU, running with the wrong image would be totally unacceptable.
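
To make the failure mode concrete, here is a minimal, deterministic sketch of the interleaving described above. It is only a model: `RunnerWorkload`, `update_image`, `start_job`, and the image tags `app:41` / `app:42` are hypothetical names for illustration, not part of the cpln CLI or of this project's code.

```python
from dataclasses import dataclass, field


@dataclass
class RunnerWorkload:
    """Hypothetical stand-in for the single shared -runner workload."""
    image: str
    started_jobs: list = field(default_factory=list)

    def update_image(self, image: str) -> None:
        # Models "update the cron workload definition".
        self.image = image

    def start_job(self, requested_by: str) -> None:
        # A job runs with whatever image the workload holds at start time.
        self.started_jobs.append((requested_by, self.image))


def interleaved_runs() -> list:
    runner = RunnerWorkload(image="app:41")
    # Caller A wants app:42 (new migrations); caller B wants app:41.
    runner.update_image("app:42")  # A updates the shared workload
    runner.update_image("app:41")  # B updates it before A has started its job
    runner.start_job("A")          # A's job now runs on B's image
    runner.start_job("B")
    return runner.started_jobs


jobs = interleaved_runs()
print(jobs)
```

In this model, caller A asked for `app:42` but its job is recorded with `app:41`, which is the "wrong image" case that is unacceptable.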

Previously, we didn’t have this issue as we always made a new workload for each job run. I’m guessing that it’s much cleaner to have one workload configured for these run commands. However, if we need to change the workload and then start a job, there’s a big race condition issue.

I see a couple ways out of this:

  1. We create multiple workloads for job runners, with a name based on the image version, and maybe the CPU and memory. Tricky thing here is when to clean these up when not used. For example, if we need to run a job with the latest image, we will use a workload runner called something like rails-runner-latest. The rails-runner workload would ALWAYS have the same image as the rails workload.
  2. CPLN updates the cpln workload cron start command to allow setting the image, CPU, and RAM.
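
As a rough illustration of option 1, a runner name could be derived deterministically from the settings that must never be mixed up. This is a sketch under assumptions: the function name `runner_workload_name`, the 8-character hash suffix, and the example image tags are all hypothetical, and it does not check any naming rules the platform may impose on workloads.

```python
import hashlib


def runner_workload_name(base: str, image: str, cpu: str, memory: str) -> str:
    """Build a runner workload name that is unique per (image, cpu, memory)."""
    # Same settings always hash to the same suffix, so concurrent jobs with
    # identical settings share a workload and differing ones never collide
    # on the same workload definition (barring hash collisions).
    settings = f"{image}|{cpu}|{memory}"
    fingerprint = hashlib.sha256(settings.encode()).hexdigest()[:8]
    return f"{base}-runner-{fingerprint}"


name_a = runner_workload_name("rails", "app:42", "500m", "1Gi")
name_b = runner_workload_name("rails", "app:41", "500m", "1Gi")
print(name_a, name_b)
```

Because a workload is never updated after creation under this scheme, the update-then-start race goes away. The cleanup question raised above remains open: something still has to decide when an unused runner can be deleted.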

justin808, May 12 '24 02:05

@rafaelgomesxyz @dzirtusss how about creating 2 runners: XXXX-runner and XXXX-runner-latest?

At least that would ensure we don't end up running a command on the wrong image ever.

justin808, May 12 '24 02:05

I guess 2 runners would work, but what if someone wants to run on an older image? The --image option accepts any image, not just current and latest.

Another potential way to solve it would be to check the image of the job upon creation, and if it doesn't match what was specified, we delete the job and create another one, but this may be a bit too complex.
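
A minimal sketch of that check-and-retry idea, to show where the complexity comes from. The three callables (`start_job`, `get_job_image`, `delete_job`) are hypothetical placeholders for whatever API or CLI calls would actually be used; nothing here reflects real cpln commands.

```python
from typing import Callable, Optional


def start_job_with_expected_image(
    start_job: Callable[[], str],
    get_job_image: Callable[[str], str],
    delete_job: Callable[[str], None],
    expected_image: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Start a job, verify its image, and retry if it does not match."""
    for _ in range(max_attempts):
        job_id = start_job()
        if get_job_image(job_id) == expected_image:
            return job_id
        # Wrong image: discard this job and try again.
        delete_job(job_id)
    return None  # Gave up; the caller has to decide what to do.


# Fake backend for demonstration: the first job gets a stale image.
images = iter(["app:41", "app:42"])
created = {}
deleted = []


def fake_start() -> str:
    job_id = f"job-{len(created) + 1}"
    created[job_id] = next(images)
    return job_id


result = start_job_with_expected_image(
    fake_start, created.__getitem__, deleted.append, "app:42"
)
print(result, deleted)
```

One caveat with this approach: unless the platform guarantees otherwise, a job with the wrong image could begin executing before the check and delete complete, so this narrows the window rather than closing it.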

Then there's also the option of using multiple containers in the same workload instead of multiple workloads.

CPLN providing a way to update cpu, memory and image with cpln workload cron start would be the best case scenario.

rafaelgomesxyz, May 13 '24 13:05

Waiting for Dan at CPLN to tell us if he'll add the options to the workload cron run args.

justin808, May 14 '24 20:05