Move Vercel deployments to Github Actions

Open berekuk opened this issue 2 years ago • 2 comments

https://vercel.com/guides/how-can-i-use-github-actions-with-vercel

Benefits:

  • we won't be limited by Vercel's single deployment thread, so deployments might be much faster
  • clear turbo build -> deploy order
  • no cache duplication that I had to implement in #2006
  • ability to automate prisma migrate with Github Actions (we could run the migration before deploying; that's not possible with Vercel, which deploys immediately on commit)

berekuk • Aug 11 '23 00:08

After experimenting with different approaches on #2229, I still think this is a good idea, but there are two problems that don't allow us to implement a perfect solution.

First, we need to detect whether a deployment should happen. On Vercel, turbo-ignore does the right thing by default, by comparing the current commit with the last successful deployment for that branch. In Github Actions, that's much harder to do. We'd have to find the latest Github deployment (separate thing from Vercel deployment — https://github.com/quantified-uncertainty/squiggle/deployments) for this branch by using Github's GraphQL API, and then run turbo-ignore with correct args (or fake Vercel env vars).

Using turbo-ignore without this change would lead to false-negative deployment decisions (deployment won't happen when it should), and using something else instead of turbo-ignore, e.g. paths-filter, won't be enough because dependencies need to be redeployed (e.g. when squiggle-lang changes we want to redeploy the hub too).
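
To make the dependency point concrete, here is a minimal sketch of why a plain path filter is not enough: a change in one package has to propagate to everything that depends on it. The graph below is hypothetical (the `website` package and all edges are assumptions for illustration, not the repository's real dependency graph), and turborepo derives the real one from package.json files.

```python
from collections import deque

# Hypothetical workspace graph: package -> packages it depends on.
# Package names echo the ones in this thread; the edges are assumed.
DEPENDS_ON = {
    "squiggle-lang": [],
    "components": ["squiggle-lang"],
    "hub": ["components", "squiggle-lang"],
    "website": ["components"],
}


def affected_packages(changed, depends_on):
    """Return the changed packages plus everything that transitively depends on them."""
    # Invert the graph: package -> packages that depend on it.
    dependents = {}
    for name, deps in depends_on.items():
        dependents.setdefault(name, set())
        for dep in deps:
            dependents.setdefault(dep, set()).add(name)

    # Breadth-first walk from the changed packages up to their dependents.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
```

A path filter on the hub directory alone would report "no change" for a commit that only touches squiggle-lang, while `affected_packages({"squiggle-lang"}, DEPENDS_ON)` includes the hub as well.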

If we disable turbo-ignore, every project would be redeployed on each commit, which is too much overhead.
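
The missing piece in Github Actions, picking the commit to compare against, could look roughly like the sketch below. It assumes the deployment list has already been fetched from Github's API and flattened into simple records; the `Deployment` shape, its field names, and the lowercase `"success"` state are simplifications made up for this example, not the real API response.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Deployment:
    # Simplified, hypothetical record; the real API response has more fields.
    ref: str          # branch the deployment was created for
    sha: str          # commit that was deployed
    environment: str  # assumed: one environment per deployed project
    state: str        # latest status, e.g. "success" or "failure"
    created_at: str   # ISO 8601 UTC timestamp, so string order equals time order


def last_successful_sha(
    deployments: Iterable[Deployment], branch: str, environment: str
) -> Optional[str]:
    """Return the commit of the newest successful deployment, or None if there is none."""
    candidates = [
        d
        for d in deployments
        if d.ref == branch and d.environment == environment and d.state == "success"
    ]
    if not candidates:
        return None  # nothing deployed yet for this branch: the caller should deploy
    return max(candidates, key=lambda d: d.created_at).sha
```

The returned SHA is what turbo-ignore would need as its comparison base, passed through its arguments or through faked Vercel environment variables as described above. A failed deployment is skipped on purpose, so that a retry still compares against the last commit that actually went live.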


Second, there are several ways to organize the pipeline, and each one has its own tradeoffs.

In our current setup, we run the Build, test, lint action on Github, and Vercel runs its own deployment build at the same time. This means that turbo build for each project in the Github Action and turbo build on Vercel happen in parallel and don't share caches.

I thought that I'd be able to run build -> deploy in a clear, correct order if we moved to Github Actions.

But:

  • if we run the Build, test, lint job first and the Deploy job second, deployments have to wait for tests on all projects to finish
    • also, there are situations where we would want to deploy even if tests are failing, e.g. when we do a hotfix or when tests for some minor package are broken
  • if we run the Build job first, then Deploy, then Tests, the tests would start much later, and we also wouldn't make full use of turborepo parallelism

What if we did everything in a single Actions job by running turbo run build test migrate deploy instead? Then the process would be as fast as possible; with correct task dependencies in turbo.json config, deployment for components would start as soon as the build for components has finished, etc.
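
The timing difference can be illustrated with a small sketch that computes the earliest finish time of each task in a per-package task graph and compares it with a staged pipeline where every deploy waits for all builds and tests. All durations (in minutes) and task edges are invented for illustration, the migrate task is left out, and unlimited parallel workers are assumed.

```python
# Hypothetical task durations in minutes, keyed as "package#task".
DURATION = {
    "squiggle-lang#build": 2,
    "components#build": 3,
    "hub#build": 4,
    "components#test": 5,
    "hub#test": 6,
    "components#deploy": 1,
    "hub#deploy": 1,
}

# Hypothetical per-task dependencies, in the spirit of turbo.json task dependencies.
DEPENDS_ON = {
    "squiggle-lang#build": [],
    "components#build": ["squiggle-lang#build"],
    "hub#build": ["components#build"],
    "components#test": ["components#build"],
    "hub#test": ["hub#build"],
    "components#deploy": ["components#build"],
    "hub#deploy": ["hub#build"],
}


def finish_times(duration, depends_on):
    """Earliest finish time of each task, assuming unlimited parallel workers."""
    finished = {}

    def finish(task):
        if task not in finished:
            start = max((finish(dep) for dep in depends_on[task]), default=0)
            finished[task] = start + duration[task]
        return finished[task]

    for task in duration:
        finish(task)
    return finished


def staged_finish_times(duration, depends_on):
    """Same tasks, but every deploy additionally waits for all build and test tasks."""
    gate = [task for task in duration if not task.endswith("#deploy")]
    gated = {
        task: deps + gate if task.endswith("#deploy") else deps
        for task, deps in depends_on.items()
    }
    return finish_times(duration, gated)


if __name__ == "__main__":
    print("per-task graph:", finish_times(DURATION, DEPENDS_ON))
    print("staged jobs:   ", staged_finish_times(DURATION, DEPENDS_ON))
```

With these made-up numbers, the components deployment is gated only by its own build in the per-task graph, while in the staged pipeline it waits for the slowest test in the whole repository.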

But there are downsides with this approach too:

  • workflow summary (example: https://github.com/quantified-uncertainty/squiggle/actions/runs/5845615655) won't show the correct graph, check output would just be "something failed"
  • you'd have to look through action logs to figure out what went wrong
  • also, this would scale worse CPU-wise; Github Actions run on 2 vCPU machines by default, and we'd have to do everything on a single machine
  • also, we'd have to create Github deployments manually through Github API (when a job creates a single deployment, you can define the deployment and its URL in the workflow's YAML; if a job creates multiple deployments, there's no such option)
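
Creating deployments manually, as in the last point, would mean two REST calls per project: one to register the deployment and one to set its status and URL. The sketch below only builds the request descriptions and sends nothing; the environment name and deployment ID passed in are hypothetical, and authentication and error handling are left out.

```python
def create_deployment_request(owner, repo, sha, environment):
    """Describe the REST call that registers a deployment for one project."""
    return {
        "method": "POST",
        "path": f"/repos/{owner}/{repo}/deployments",
        "body": {
            "ref": sha,
            "environment": environment,  # assumed: one environment per project
            "auto_merge": False,         # deploy exactly this commit
            "required_contexts": [],     # do not wait for commit status checks
        },
    }


def deployment_status_request(owner, repo, deployment_id, state, environment_url):
    """Describe the follow-up call that sets the deployment's state and URL."""
    return {
        "method": "POST",
        "path": f"/repos/{owner}/{repo}/deployments/{deployment_id}/statuses",
        "body": {"state": state, "environment_url": environment_url},
    }


def requests_for_projects(owner, repo, sha, projects):
    """One create-deployment request per project deployed by the single job."""
    return [create_deployment_request(owner, repo, sha, p) for p in projects]
```

A single job that deploys several projects would issue `requests_for_projects(...)` first, then one status request per deployment once its URL is known.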

So there's a conceptual mismatch between Github Actions' "static dependency graph" approach and turborepo's "dynamic dependency graph based on package.json dependencies and turbo.json configs".

This problem is less important than the one with turbo-ignore. We could start the deployment immediately after running prisma migrate, in parallel with the Build, test, lint step, as we did previously with Vercel. But I'm still unhappy that there's no way to optimize it further.


For now, I'm going to put this on hold, maybe for a week or two, and check later if I still want to do this or if I have any better ideas.

berekuk • Aug 13 '23 20:08

Other findings for when I get back to it:

  • Instead of removing git_repository from the Vercel project in Terraform configs, deployments can be disabled with the git.deploymentEnabled option.
  • The vercel build CLI command will fail if root_directory is configured for a Vercel project; that setting should be disabled in Terraform configs when we make this change.
  • It's useful to do vercel alias after vercel deploy; https://vercel.com/guides/how-to-alias-a-preview-deployment-using-the-cli. Btw, we could have nice custom urls such as https://preview-pr-1234.squigglehub.org with this feature.
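
A minimal sketch of how the preview URL from the last point could be derived in a workflow script. The domain comes from the example URL above; the helper names are hypothetical, and the `vercel alias set <deployment-url> <custom-domain>` form follows the linked guide and should be checked against the CLI version in use.

```python
PREVIEW_DOMAIN = "squigglehub.org"  # taken from the example URL above


def preview_alias(pr_number, domain=PREVIEW_DOMAIN):
    """Build the custom preview hostname for a pull request."""
    if pr_number <= 0:
        raise ValueError("pull request number must be positive")
    return f"preview-pr-{pr_number}.{domain}"


def alias_command(deployment_url, pr_number):
    """Argument list for aliasing a finished deployment to the preview hostname."""
    # The command is only built here, not executed.
    return ["vercel", "alias", "set", deployment_url, preview_alias(pr_number)]
```

The deployment URL would be the one printed by vercel deploy, so the alias step has to run after the deploy step in the same job.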

berekuk • Aug 13 '23 20:08