FR: Allow startup with unreachable `provisioner`

Open LecrisUT opened this issue 3 years ago • 1 comments

Description

If a provisioner cannot be reached, e.g. because its OAuth server is down, allow step-ca to start with the remaining functioning provisioners. This is probably already covered by the new management revamp, but it's worth keeping an issue for it. @dopey could you confirm this?

Use case

This is part of my recent hiccups when bootstrapping a fully integrated server after a long power outage. The relevant setup for this issue is:

  • keycloak uses certificates from step-ca ACME with caddy automatically updating the certificates.
  • step-ca uses keycloak's HTTPS endpoints for its OAuth provisioner. As a workaround we could probably point it at the internal .well-known endpoint without HTTPS, but this needs to be tested.
  • After a long outage keycloak's certificate has expired, and step-ca will not start because it detects that the OAuth server's TLS certificate has expired. Conversely, caddy cannot reach the ACME endpoint to renew the certificate because step-ca is not running.

LecrisUT, May 27 '21 23:05

Discussed during a triage meeting and, in short, we agree.

Currently, step-ca fetches and caches the OIDC well-known configuration at startup and then refreshes it periodically. This should change so that the OIDC details are not requested at startup (allowing the CA to load); the first load would instead be attempted on first use of the provisioner.

Going into our backlog, but if anyone is looking for a way to contribute we'd happily accept a PR. Please reach out if you're interested.

As a workaround for the original issue, you can remove the OIDC provisioner, wait for the keycloak server to get its cert from the ACME provisioner, then add the OIDC provisioner back. Not ideal, but it will get you unstuck.

This will actually need to be fixed in short order once managed provisioners are mainstream, because users will have no way to make changes to provisioners if the CA cannot even start up. (Right now you can just update the JSON, but we're moving away from that.)

@LecrisUT thanks for bringing this to our attention.

dopey, Jun 08 '21 19:06