Remove all Alpine Linux images in favour of debian:bookworm-slim

Open jaskaransarkaria opened this issue 9 months ago • 1 comments

Background

Alpine Linux introduces many issues that can break workflows. We have already been stung by this: builds in our CLI have broken randomly, and recently, while developing a jq command, we found that it behaved one way on every non-Alpine OS and another way on Alpine.

By using Alpine, you're getting "free" chaos engineering for your cluster.

Some of it stems from how musl (and therefore also Alpine) handles DNS (it's always DNS). More specifically, musl (by design) doesn't support DNS-over-TCP. Usually you would not notice this difference, because most of the time a single UDP packet (512 bytes) is enough to resolve hostnames... until it isn't, and your application (running on Kubernetes) that previously worked completely fine for months suddenly starts throwing "Unknown Host" exceptions for one particular (very critical) hostname. The worst part is that this can manifest randomly, any time some external network change causes the resolution of a particular domain to require more than the 512 bytes available in a single UDP packet.

ref
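
To see whether a given hostname is at risk of the truncation problem described above, one option is to query it over plain UDP and check for the TC (truncated) flag in the response header. The sketch below assumes `dig` (from the BIND utilities) is installed; `has_tc_flag` is a hypothetical helper name, and `some.hostname.example` is a placeholder rather than a hostname from this issue.

```shell
#!/bin/sh
# Sketch only: has_tc_flag is a hypothetical helper, not part of any tool.
# It reads dig output on stdin and succeeds (exit 0) when the header's
# flags line contains "tc", meaning the UDP answer was truncated.
has_tc_flag() {
  grep -Eq '^;; flags:[^;]* tc[ ;]'
}

# Example usage (commented out because it needs network access and dig):
#   +noedns  disables EDNS so the classic 512-byte UDP limit applies
#   +notcp   sends the query over UDP
#   +ignore  prints the truncated reply instead of retrying over TCP
#
# dig +noedns +notcp +ignore some.hostname.example | has_tc_flag \
#   && echo "answer truncated: a resolver without TCP fallback may fail here"
```

A truncated answer here does not prove that an Alpine-based container will fail, only that the name cannot be fully resolved within a single 512-byte UDP reply.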

👇🏽 Dockerfiles using Alpine Linux

jaskaran 14:35:42 repos →  find cloud-platform-* -type f -name \* | xargs -n 1 | xargs -I % grep -l alpine %
cloud-platform-environments/cmd/delete-oldsnapshots/Dockerfile
cloud-platform-environments/cmd/compare-namespace/Dockerfile
cloud-platform-environments/cmd/check-terraform-modules-are-latest/Dockerfile
cloud-platform-environments/cmd/push-terraform-module-version/Dockerfile
cloud-platform-go-get-module/Dockerfile
cloud-platform-hammer-bot/slackbot/Dockerfile
cloud-platform-hammer-bot/Dockerfile
cloud-platform-how-out-of-date-are-we/Dockerfile_go
cloud-platform-how-out-of-date-are-we/Dockerfile
cloud-platform-how-out-of-date-are-we/dashboard-reporter/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/documentation/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/helm-releases/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/namespace-usage/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/terraform-modules/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/namespace-costs/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/orphaned-aws-resources/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/orphaned-terraform-statefiles/Dockerfile
cloud-platform-infrastructure/test/docker/curl-jq.Dockerfile
cloud-platform-kuberhealthy-checks/cmd/namespace-check/Dockerfile
cloud-platform-label-pods/Dockerfile
cloud-platform-tools-image/Dockerfile
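
A narrower variant of the search above might help when tracking progress on this issue. The sketch below is an assumption, not the command that produced the list: it only looks at files whose name contains `Dockerfile` and only reports those with a `FROM` line mentioning alpine, so it can return a different set of files than a plain `grep -l alpine`. The function name `find_alpine_dockerfiles` is hypothetical.

```shell
#!/bin/sh
# Sketch only: find_alpine_dockerfiles is a hypothetical helper.
# Prints, in sorted order, every file under the given directories whose
# name contains "Dockerfile" and that has a FROM line referencing alpine
# (for example "FROM alpine:3.19" or "FROM golang:1.22-alpine AS build").
find_alpine_dockerfiles() {
  find "$@" -type f -name '*Dockerfile*' \
    -exec grep -liE '^[[:space:]]*FROM[[:space:]].*alpine' {} + | sort
}

# Example usage from the directory that holds the checked-out repos:
# find_alpine_dockerfiles cloud-platform-*
```

Matching on `FROM` lines skips files that only mention alpine in a comment, which keeps the list focused on images that actually need a new base.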

Definition of done

  • [ ] readme has been updated
  • [ ] user docs have been updated
  • [ ] another team member has reviewed
  • [ ] smoke tests are green
  • [ ] prepare demo for the team

Reference

How to write good user stories

jaskaransarkaria, Apr 30 '24 13:04

Still to discuss the way forward. Marked as blocked for the moment.

Matt-Alinosn, May 08 '24 10:05