gnomad-browser
GKE upgrades blocked due to use of deprecated APIs
It looks like our GKE cluster running gnomad won't auto-upgrade to the kubernetes 1.22 release, because some resources we have deployed are using deprecated APIs:
[Image: Screen Shot 2022-07-08 at 9 33 36 AM]
Still have to track down what that ingress object is, but I'm pretty sure the validating webhook is from the Elastic/ECK operator. Looking quickly at Elastic's EOL policies, both the version of the ECK operator (1.2.1) and the version of Elasticsearch itself that we're running (6.8.x) were deprecated in January.
I think we need to assess our compatibility with Elasticsearch 7.17+ and plan an upgrade using a newer version of the ECK operator. I don't believe that the stopped GKE upgrades are critical at the moment -- the EOL date for GKE 1.21 is in December 2022, so I believe we have until at least then before they force an upgrade on us.
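To help track down which deployed objects are affected, here is a minimal sketch that scans manifest text for API versions Kubernetes 1.22 no longer serves. It is not part of deployctl. The function name `find_removed_apis`, the sample manifest, and the choice to list only the API groups relevant to this thread are all assumptions made for illustration.

```python
import re

# API versions that Kubernetes 1.22 stopped serving, mapped to replacements.
# Only the groups relevant to this thread are listed (CRDs, admission
# webhooks, Ingress); this is not the complete list of 1.22 removals.
REMOVED_IN_1_22 = {
    "apiextensions.k8s.io/v1beta1": "apiextensions.k8s.io/v1",
    "admissionregistration.k8s.io/v1beta1": "admissionregistration.k8s.io/v1",
    "networking.k8s.io/v1beta1": "networking.k8s.io/v1",
    "extensions/v1beta1": "networking.k8s.io/v1",  # Ingress
}


def find_removed_apis(manifest_text):
    """Return (kind, api_version, replacement) for each affected document.

    Hypothetical helper: uses a simple regex on top-level keys rather than
    a YAML parser, so it only handles plainly formatted manifests.
    """
    findings = []
    # Multi-document YAML separates documents with a line containing '---'.
    for doc in re.split(r"^---\s*$", manifest_text, flags=re.MULTILINE):
        api = re.search(r"^apiVersion:\s*(\S+)", doc, flags=re.MULTILINE)
        kind = re.search(r"^kind:\s*(\S+)", doc, flags=re.MULTILINE)
        if api and api.group(1) in REMOVED_IN_1_22:
            findings.append(
                (
                    kind.group(1) if kind else "unknown",
                    api.group(1),
                    REMOVED_IN_1_22[api.group(1)],
                )
            )
    return findings


# Made-up sample manifest, for illustration only.
SAMPLE = """\
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: examples.example.com
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example-ingress
"""

for kind, old, new in find_removed_apis(SAMPLE):
    print(f"{kind}: {old} -> {new}")
```

Running this against the output of `kubectl get ... -o yaml` or against the manifests that deployctl applies would be one way to use it. It only checks the manifest text, so objects created by an operator at runtime still need to be checked in the cluster itself.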
Thanks for noticing this Steve. If EOL is this year we should definitely prioritize an ES upgrade soon (https://github.com/broadinstitute/gnomad-browser/issues/929).
Hi @sjahl, I'm new to gnomAD and trying to get a deployment running. I've encountered a related issue with a new deployment:
Using deployctl for a new deployment on GCP will instantiate a cluster with k8s version 1.22 (labeled as stable by GCP as of Nov 2022) which has removed the API apiextensions.k8s.io/v1beta1 in favour of apiextensions.k8s.io/v1. After that, when configuring the cluster for Elasticsearch, deployctl uses ECK version 1.2.1 which is very outdated and still relies on the deprecated apiextensions.k8s.io/v1beta1 API. ECK started supporting k8s 1.22 in version 1.7.x and they're now at 2.5.0! Also in version 1.7.0 they switched from a single all-in-one.yaml manifest to 2 separate files for custom resources (crds.yaml) and operator (operator.yaml).
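As a rough illustration of the manifest layout change described above, the sketch below picks the manifest file names for a given ECK version. It is not deployctl's code: the function names and the `base_url` default are assumptions (the URL follows the pattern used in Elastic's install instructions, but should be checked against the docs for the target version).

```python
def parse_version(version):
    """Turn a dotted version string such as '2.5.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def eck_manifest_urls(version, base_url="https://download.elastic.co/downloads/eck"):
    """Return the manifest URLs to apply for an ECK version (hypothetical helper).

    Before 1.7.0 ECK shipped a single all-in-one.yaml; from 1.7.0 onward the
    custom resources and the operator are in separate files.
    """
    if parse_version(version) >= (1, 7, 0):
        # CRDs are listed first so they exist before the operator starts.
        names = ["crds.yaml", "operator.yaml"]
    else:
        names = ["all-in-one.yaml"]
    return [f"{base_url}/{version}/{name}" for name in names]


for v in ("1.2.1", "2.5.0"):
    print(v, eck_manifest_urls(v))
```

A patched deployctl could loop over the returned URLs and apply each one in order, which would keep the old and new layouts behind one code path.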
Anyway, as we proceed with our deployment we'll be investigating this issue further and probably trying to patch deployctl. Maybe this can become a standalone issue? If so I'm happy to take it on and make a PR if that helps.
Hi @ammazzaw -- we are planning to upgrade both ECK and Elasticsearch itself relatively soon; it's pretty close to the top of my priority stack right now: https://github.com/broadinstitute/gnomad-browser/issues/929. The big open question for us at the moment is whether Elasticsearch 7 or 8 includes any changes that break gnomAD, and we're working on a plan to test that. I'd welcome any feedback you have in that area as you get something stood up.
In the grand scheme of things, I'd like to deprecate deployctl in favor of more standard deployment tooling (e.g. Terraform and Helm/Kustomize), so I'm not placing a high priority on patching deployctl itself. But, I'm happy to review and consider any patches that you have. The only constraint is that we can't really let the deployctl script advance too far beyond the official gnomAD browser deployment.
The below patch allows deployment with the current code base:
diff --git a/deploy/deployctl/subcommands/setup.py b/deploy/deployctl/subcommands/setup.py
index 219a25a8..94a501b7 100644
--- a/deploy/deployctl/subcommands/setup.py
+++ b/deploy/deployctl/subcommands/setup.py
@@ -210,7 +210,7 @@ def create_cluster() -> None:
f"--zone={config.zone}",
"--release-channel=stable",
"--enable-autorepair",
- "--enable-autoupgrade",
+ "--cluster-version=1.21.14-gke.3000",
"--maintenance-window=7:00",
f"--service-account={config.gke_service_account_full_name}",
f"--network={config.network_name}",
Gonna close this -- the prod gnomAD cluster has been updated to ECK 2.5.0, which should resolve the outstanding API deprecation warnings. deployctl changes are in #1039