terraform-google-lb-http

Managed SSL multi-domain update without downtime

Open philsch opened this issue 3 years ago • 5 comments

TL;DR

Changes to managed_ssl_certificate_domains cause downtime because the SSL certificate is replaced. This should be prevented.

Terraform Resources

No response

Detailed design

No response

Additional information

When adding a domain to managed_ssl_certificate_domains, the SSL certificate is replaced, because the certificate itself must cover the additional domain. The lifecycle rule create_before_destroy = true makes no difference here, because Terraform treats the new certificate as a "ready" resource even while it is still in the "Provisioning" state (and the certificate needs to be attached to the LB before it can be verified).
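To illustrate why the lifecycle rule does not help, here is a minimal sketch of a single shared certificate with create_before_destroy. The random_id suffix and the resource names are assumptions made for illustration, not necessarily the module's actual code:

```hcl
# Illustrative sketch only; resource names and the random_id pattern are
# assumptions, not necessarily the module's actual implementation.

# A new suffix is generated whenever the domain list changes, so the
# replacement certificate gets a name that does not collide with the old one.
resource "random_id" "certificate" {
  byte_length = 4

  keepers = {
    domains = join(",", var.managed_ssl_certificate_domains)
  }
}

resource "google_compute_managed_ssl_certificate" "default" {
  project = var.project
  name    = "${var.name}-cert-${random_id.certificate.hex}"

  # Controls ordering only: the new certificate is created before the old
  # one is destroyed, but Terraform does not wait for it to become active.
  lifecycle {
    create_before_destroy = true
  }

  managed {
    domains = var.managed_ssl_certificate_domains
  }
}
```

With this setup the ordering is correct (create, switch the proxy, destroy), but the proxy is switched to a certificate that is still provisioning, so the old certificate is gone before the new one can serve traffic.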

While the new certificate is in the provisioning state, none of the other domains can be served via SSL either. Example curl output:

curl: (35) error:14004410:SSL routines:CONNECT_CR_SRVR_HELLO:sslv3 alert handshake failure

With the current design this is not preventable. However, it is possible to create one certificate per domain (instead of adding all domains to a single SSL certificate) and attach those certificates to the Load Balancer. I tested this approach as follows:

# HTTPS proxy when ssl is true
resource "google_compute_target_https_proxy" "default" {
  project = var.project
  count   = var.ssl ? 1 : 0
  name    = "${var.name}-https-proxy"
  url_map = local.url_map

  # Combine user-supplied, self-managed, and per-domain managed certificates.
  ssl_certificates = compact(concat(
    var.ssl_certificates,
    google_compute_ssl_certificate.default.*.self_link,
    [for c in google_compute_managed_ssl_certificate.default : c.self_link]
  ))
  ssl_policy       = var.ssl_policy
  quic_override    = var.quic ? "ENABLE" : null
}

# ...

resource "google_compute_managed_ssl_certificate" "default" {
  provider = google-beta
  project  = var.project
  for_each = toset(var.managed_ssl_certificate_domains)

  name = "${var.name}-cert-${replace(each.key, ".", "-")}"

  managed {
    domains = [each.key]
  }
}

It looks like, with this approach, the existing domains remain reachable and SSL termination keeps working while the additional certificate is provisioned.

You can find my test implementation at https://github.com/philsch/terraform-google-lb-http/tree/feat/interruption_free_managed_ssl . Please provide your feedback on this idea. I'll be happy to create a PR if this approach makes sense.

philsch avatar Apr 07 '22 09:04 philsch

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days

github-actions[bot] avatar Jun 06 '22 23:06 github-actions[bot]

bump

philsch avatar Jun 12 '22 11:06 philsch

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days

github-actions[bot] avatar Aug 11 '22 23:08 github-actions[bot]

@philsch sorry for the delay. This seems reasonable although I am curious about any quota constraints.

bharathkkb avatar Aug 18 '22 23:08 bharathkkb

Thanks for writing back on this topic, @bharathkkb. You're absolutely right: if I read the GCloud docs correctly, this approach reduces the maximum number of parallel domains from 100 to 15. I believe most users would still prefer a downtime-free deployment, as more than 15 parallel domains is probably not that common. However, introducing this as a breaking change is not a good solution either.

Do you think splitting the input parameters into managed_ssl_certificate_domains and managed_distinct_ssl_certificate_domains would allow a non-breaking change? Do you have any other ideas?
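A rough sketch of what such a split could look like. Everything here is hypothetical: managed_distinct_ssl_certificate_domains and the "distinct" resource name are illustrative only, and the limit of 15 is the figure quoted above, which should be checked against the current quota documentation:

```hcl
# Hypothetical sketch; names other than managed_ssl_certificate_domains
# are assumptions, not part of the module's actual interface.

variable "managed_ssl_certificate_domains" {
  description = "Domains covered by one shared managed certificate (existing behavior)."
  type        = list(string)
  default     = []
}

variable "managed_distinct_ssl_certificate_domains" {
  description = "Domains that each get their own managed certificate."
  type        = list(string)
  default     = []

  validation {
    # Assumed limit of 15 certificates per target HTTPS proxy. Certificates
    # passed in through other inputs count against the same limit.
    condition     = length(var.managed_distinct_ssl_certificate_domains) <= 15
    error_message = "At most 15 distinct certificate domains are supported."
  }
}

# One certificate per domain, so adding a domain only creates a new
# certificate and leaves the existing ones untouched.
resource "google_compute_managed_ssl_certificate" "distinct" {
  for_each = toset(var.managed_distinct_ssl_certificate_domains)

  project = var.project
  name    = "${var.name}-cert-${replace(each.key, ".", "-")}"

  managed {
    domains = [each.key]
  }
}
```

Existing users would keep the shared certificate by default, and the per-domain behavior would be opt-in. The self_link values of the "distinct" certificates would still need to be added to ssl_certificates on the HTTPS proxy, in the same way as in the snippet from the original post.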

philsch avatar Aug 20 '22 11:08 philsch

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days

github-actions[bot] avatar Oct 19 '22 23:10 github-actions[bot]

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days

github-actions[bot] avatar Dec 26 '22 23:12 github-actions[bot]

Bump

mr-pascal avatar Dec 27 '22 04:12 mr-pascal

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days

github-actions[bot] avatar Feb 26 '23 23:02 github-actions[bot]