Move Vercel deployments to Github Actions
https://vercel.com/guides/how-can-i-use-github-actions-with-vercel
Benefits:
- we won't be limited by Vercel's single deployment thread, so deployments might be much faster
- clear turbo build -> deploy order; no cache duplication that I had to implement in #2006
- ability to automate prisma migrate with Github Actions (we could run migrations before deploying; that's not possible with Vercel, which deploys immediately on commit)
After experimenting with different approaches on #2229, I still think this is a good idea, but there are two problems that don't allow us to implement a perfect solution.
First, we need to detect whether a deployment should happen. On Vercel, turbo-ignore does the right thing by default, by comparing the current commit with the last successful deployment for that branch. In Github Actions, that's much harder to do. We'd have to find the latest Github deployment (separate thing from Vercel deployment — https://github.com/quantified-uncertainty/squiggle/deployments) for this branch by using Github's GraphQL API, and then run turbo-ignore with correct args (or fake Vercel env vars).
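A rough sketch of the "fake Vercel env vars" variant, to make the moving parts concrete. The `DeploymentRecord` shape and both function names are hypothetical (the real data would have to be mapped from whatever the Github API query returns), and it assumes that turbo-ignore picks its comparison commit from `VERCEL_GIT_PREVIOUS_SHA` when `VERCEL=1`, which should be checked against the installed turbo-ignore version.

```typescript
// Hypothetical record shape, not an actual Github API type.
interface DeploymentRecord {
  sha: string;
  ref: string; // branch name
  environment: string; // one environment per deployed project
  state: "SUCCESS" | "FAILURE" | "PENDING";
  createdAt: string; // ISO 8601 timestamp
}

// Returns the commit of the most recent successful deployment
// for the given branch and environment, or undefined if there is none.
function lastSuccessfulSha(
  deployments: DeploymentRecord[],
  ref: string,
  environment: string
): string | undefined {
  const candidates = deployments
    .filter(
      (d) =>
        d.ref === ref && d.environment === environment && d.state === "SUCCESS"
    )
    .sort((a, b) => Date.parse(b.createdAt) - Date.parse(a.createdAt));
  return candidates[0]?.sha;
}

// Builds the environment for a turbo-ignore run outside of Vercel.
// Assumption: turbo-ignore reads these variables the same way it does on Vercel.
function turboIgnoreEnv(
  ref: string,
  previousSha?: string
): Record<string, string> {
  const env: Record<string, string> = {
    VERCEL: "1",
    VERCEL_GIT_COMMIT_REF: ref,
  };
  if (previousSha !== undefined) {
    env["VERCEL_GIT_PREVIOUS_SHA"] = previousSha;
  }
  return env;
}
```

When no previous deployment exists, the variable is left unset on purpose, so turbo-ignore would fall back to its own default behavior for a first deployment.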
Using turbo-ignore without this change would lead to false-negative deployment decisions (deployment won't happen when it should), and using something else instead of turbo-ignore, e.g. paths-filter, won't be enough because dependencies need to be redeployed (e.g. when squiggle-lang changes we want to redeploy the hub too).
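To illustrate why a plain path filter is not enough: the set of projects to redeploy is the set of changed packages plus everything that transitively depends on them. A minimal sketch, where the dependency map is illustrative only (the package names come from this repo, but the exact edges are an assumption, and `affectedPackages` is a hypothetical helper, not a turborepo API):

```typescript
// Maps each package to the workspace packages it depends on.
type DependencyMap = Record<string, string[]>;

// Returns the changed packages plus all of their transitive dependents.
function affectedPackages(deps: DependencyMap, changed: string[]): Set<string> {
  // Invert the map: package -> packages that depend on it.
  const dependents = new Map<string, string[]>();
  for (const [pkg, pkgDeps] of Object.entries(deps)) {
    for (const dep of pkgDeps) {
      const list = dependents.get(dep) ?? [];
      list.push(pkg);
      dependents.set(dep, list);
    }
  }

  // Breadth-first walk over the inverted graph.
  const affected = new Set<string>(changed);
  const queue = [...changed];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const dependent of dependents.get(current) ?? []) {
      if (!affected.has(dependent)) {
        affected.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return affected;
}

// Illustrative graph; the real one lives in the package.json files.
const exampleDeps: DependencyMap = {
  "squiggle-lang": [],
  components: ["squiggle-lang"],
  hub: ["squiggle-lang", "components"],
};
```

With this graph, a change in squiggle-lang marks components and hub as affected too, while a path filter on the hub directory alone would miss it.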
If we disable turbo-ignore, all projects would be redeployed on each commit, which is too much overhead.
Second, there are several different ways to organize the pipeline, and each one has its own tradeoffs.
In our current setup, we run the Build, test, lint action on Github, and Vercel runs its own deployment build at the same time. This means that turbo build for each project in the Github Action and turbo build on Vercel happen in parallel, and they don't share caches.
I thought that I'd be able to run build -> deploy in a clear correct order if we move to Github Actions.
But:
- if we run Build, test, lint job first and Deploy job second, it means that deployments have to wait for tests on all projects to finish
  - also, there are situations where we would want to deploy even if tests are failing, e.g. when we do a hotfix or if some minor package tests are broken
- if we run Build job first, then Deploy, then Tests, then tests would start much later, and also we won't make full use of turborepo parallelism
What if we did everything in a single Actions job by running turbo run build test migrate deploy instead? Then the process would be as fast as possible; with correct task dependencies in turbo.json config, deployment for components would start as soon as the build for components has finished, etc.
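A small model of how such task dependencies resolve, as a sketch only. It mirrors the turbo.json convention where a plain name in `dependsOn` means a task in the same package and a `^` prefix means that task in the package's dependencies. The config, the package graph, and `taskDependencies` are hypothetical, migrate is left out for brevity, and every package is assumed to define every task.

```typescript
type TaskConfig = Record<string, { dependsOn: string[] }>;
type PackageDeps = Record<string, string[]>;

// Expands one package-level task into the package-level tasks it waits for.
function taskDependencies(
  pkg: string,
  task: string,
  config: TaskConfig,
  packageDeps: PackageDeps
): string[] {
  const result: string[] = [];
  for (const dep of config[task]?.dependsOn ?? []) {
    if (dep.startsWith("^")) {
      // "^build": the build task of every package this one depends on.
      for (const upstream of packageDeps[pkg] ?? []) {
        result.push(`${upstream}#${dep.slice(1)}`);
      }
    } else {
      // "build": the build task of the same package.
      result.push(`${pkg}#${dep}`);
    }
  }
  return result;
}

// Hypothetical task config in the spirit of turbo.json.
const exampleTasks: TaskConfig = {
  build: { dependsOn: ["^build"] },
  test: { dependsOn: ["build"] },
  deploy: { dependsOn: ["build"] },
};

// Illustrative package graph (edges are an assumption).
const examplePackages: PackageDeps = {
  "squiggle-lang": [],
  components: ["squiggle-lang"],
  hub: ["squiggle-lang", "components"],
};
```

Here the deploy task for components waits only for the components build, not for the hub build or for any tests, which is the kind of fine-grained ordering a single turbo run would give.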
But there are downsides with this approach too:
- workflow summary (example: https://github.com/quantified-uncertainty/squiggle/actions/runs/5845615655) won't show the correct graph, check output would just be "something failed"
- you'd have to look through action logs to figure out what went wrong
- also, this would scale worse CPU-wise; Github Actions run on 2 vCPU machines by default, and we'd have to do everything on a single machine
- also, we'd have to create Github deployments manually through Github API (when a job creates a single deployment, you can define the deployment and its URL in the workflow's YAML; if a job creates multiple deployments, there's no such option)
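For the last point, creating deployments manually would mean two REST calls per project: one to create the deployment and one to attach a status with the URL. A sketch of the request payloads, without the actual HTTP calls; the helper names and the `ApiRequest` type are hypothetical, and the field values are assumptions about what we'd want:

```typescript
interface ApiRequest {
  method: "POST";
  path: string;
  body: Record<string, unknown>;
}

// Request that creates a Github deployment for a commit or branch.
function createDeploymentRequest(
  owner: string,
  repo: string,
  ref: string,
  environment: string
): ApiRequest {
  return {
    method: "POST",
    path: `/repos/${owner}/${repo}/deployments`,
    body: {
      ref,
      environment,
      // Assumption: we don't want Github to merge the default branch
      // or to wait for status checks before creating the deployment.
      auto_merge: false,
      required_contexts: [],
    },
  };
}

// Request that marks a deployment as successful and attaches its URL.
function deploymentStatusRequest(
  owner: string,
  repo: string,
  deploymentId: number,
  environmentUrl: string
): ApiRequest {
  return {
    method: "POST",
    path: `/repos/${owner}/${repo}/deployments/${deploymentId}/statuses`,
    body: { state: "success", environment_url: environmentUrl },
  };
}
```

Each deployed project would need its own pair of calls, with a distinct environment name, so that the deployments page can tell them apart.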
So there's a conceptual mismatch between Github Actions "static dependency graph" approach, and turborepo's "dynamic dependency graph based on package.json dependencies and turbo.json configs".
This problem is less important than the one with turbo-ignore. We could start the deployment immediately, in parallel with the Build, test, lint step, as we did previously with Vercel (after we run prisma migrate). But I'm still unhappy that there's no way to optimize it further.
For now, I'm going to put this on hold, maybe for a week or two, and check later if I still want to do this or if I have any better ideas.
Other findings for when I get back to it:
- Instead of removing git_repository from Vercel project in Terraform configs, deployments can be disabled with git.deploymentEnabled option.
- vercel build CLI command will fail if root_directory is configured for a Vercel project; it should be disabled in Terraform configs when we make this change.
- It's useful to do vercel alias after vercel deploy; https://vercel.com/guides/how-to-alias-a-preview-deployment-using-the-cli. Btw, we could have nice custom urls such as https://preview-pr-1234.squigglehub.org with this feature.
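The alias step could look roughly like this. Both helpers are hypothetical; the hostname pattern follows the example URL above, and the command uses the `vercel alias set <deployment-url> <custom-domain>` form from the linked guide.

```typescript
// Builds the preview hostname for a pull request, e.g. preview-pr-1234.squigglehub.org.
function previewAlias(prNumber: number, baseDomain = "squigglehub.org"): string {
  if (!Number.isInteger(prNumber) || prNumber <= 0) {
    throw new Error(`Invalid PR number: ${prNumber}`);
  }
  return `preview-pr-${prNumber}.${baseDomain}`;
}

// Builds the argument list for the alias command that would run after
// vercel deploy has printed the deployment URL.
function aliasCommand(deploymentUrl: string, alias: string): string[] {
  return ["vercel", "alias", "set", deploymentUrl, alias];
}
```

In a workflow, the deployment URL would come from the output of the deploy step and the PR number from the event payload.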