fix: key usage attrs on intermediate issuer

Open johnalotoski opened this issue 3 years ago • 0 comments

WARNING! This may be disruptive to jobs as this changes the pki issuer; care is needed in following the steps below.

  • Note: This commit and procedure do not apply to prem clusters and only apply to AWS clusters

    • Prem clusters can skip this workflow.
  • This commit fixes broken crl signing capability of the intermediate ca and other missing issuer key features.

    • The prior key definition of allowed_uses was not in the correct format and was effectively a no-op.
  • With this commit's change, the process below needs to be performed to actually rotate the issuer with corrected key permissions.

  • Example cmd to determine if your cluster is affected -- with admin vault creds, issue:

# Check ability to CRL rotate
❯ vault read pki/crl/rotate
<...snip...>
error encountered during CRL building: error building CRLs: 
  unable to build CRL for issuer ($ISSUER): 
  error creating new CRL: x509: issuer must have the crlSign key usage bit set

# Or examine intermediate signing key usage properties by first obtaining the default issuer:
vault read pki/config/issuers

# Then using the $DEFAULT_ISSUER from the preceding command, check key usage extensions:
# Example output shown would be for an affected cluster:
❯ vault read -format=json pki/issuer/$DEFAULT_ISSUER  | jq -r '.data.certificate' | openssl x509 -noout -ext keyUsage
No extensions in certificate
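The key usage check above can be wrapped in a small helper for scripting. This is a minimal sketch, not part of the original commit: the function name `has_crl_sign` is hypothetical, and it assumes the openssl key usage text is piped in on stdin.

```shell
# Hypothetical helper: returns 0 if the key usage text on stdin
# includes the CRL Sign bit, and 1 otherwise.
has_crl_sign() {
  grep -q 'CRL Sign'
}

# Possible usage against a live cluster (needs vault, jq, openssl and admin creds):
#   vault read -format=json "pki/issuer/$DEFAULT_ISSUER" \
#     | jq -r '.data.certificate' \
#     | openssl x509 -noout -ext keyUsage \
#     | has_crl_sign && echo "not affected" || echo "affected"
```

An affected cluster yields no `CRL Sign` text, so the helper returns non-zero there.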
  • General procedure to update (vault admin or root token required for some commands below):
# View the structure of the ca_chain for the default pki issuer.
# With our standard aws config we expect to see two certs shown in the ca_chain.
# We also expect to see the same structure (with a different first cert) at the end of this procedure.
❯ curl -s -H "X-Vault-Token: $VAULT_TOKEN" $VAULT_ADDR/v1/pki/ca_chain

# List existing vault issuers -- with our standard config, we expect to see two for the pki/ secrets mount
❯ vault list pki/issuers

# The existing two standard pki/ mount issuers need to be deleted to be updated.
# If left undeleted, the ca_chain endpoint will not have two certs at the end of this procedure
#
# NOTE: If you see more than two issuers, your cluster has some custom pki issuers created.
#             Make sure you know which ones you should *not* delete before proceeding!
#
# Repeat this cmd for each of the two standard pki/ mount issuers:
❯ curl -XDELETE -s -H "X-Vault-Token: $VAULT_TOKEN" $VAULT_ADDR/v1/pki/issuer/$ISSUER

# Verify no standard pki/ mount issuers remain
❯ vault list pki/issuers

# Force deletion of the existing intermediate cert signing request from TF state for the pki/ mount.
# This ensures the next TF plan for hydrate-cluster re-requests an intermediate signing CSR,
# locally signs a new intermediate with fixed key usage attributes and sets this new intermediate
# as the signed intermediate for the pki/ mount.
❯ nix run .#clusters.$CLUSTER.tf.hydrate-cluster.terraform -- \
  state rm -state=terraform-hydrate-cluster.tfstate \
  vault_pki_secret_backend_intermediate_cert_request.issuing_ca

# Now plan an update for hydrate-cluster TF workspace:
# Expect to see creation of resource:
#   vault_pki_secret_backend_intermediate_cert_request.issuing_ca
# Replacement of resources: 
#   tls_locally_signed_cert.issuing_ca
#   vault_pki_secret_backend_intermediate_set_signed.issuing_ca
❯ nix run .#clusters.$CLUSTER.tf.hydrate-cluster.plan

# Then apply after reviewing the changes are acceptable:
❯ nix run .#clusters.$CLUSTER.tf.hydrate-cluster.apply
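Since the procedure above warns against deleting custom issuers, a guard along these lines could be run before the delete step. It is a sketch under assumptions: the function names are hypothetical, and it expects the default table output of `vault list pki/issuers` (a `Keys` header, a `----` separator, then one issuer ID per line) on stdin.

```shell
# Hypothetical helper: count issuer IDs in the table output of
# `vault list pki/issuers`, skipping the two header lines.
count_issuers() {
  awk 'NR > 2 && NF { n++ } END { print n + 0 }'
}

# Hypothetical guard: succeed only when exactly two issuers are listed on stdin.
expect_two_issuers() {
  n=$(count_issuers)
  if [ "$n" -ne 2 ]; then
    echo "found $n issuers, expected 2: check for custom pki issuers before deleting" >&2
    return 1
  fi
}

# Possible usage:
#   vault list pki/issuers | expect_two_issuers
```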
  • Once the update above is done, verify the change has been properly applied in vault:
# Once TF applied, check the number of standard pki/ mount issuers; there should be two:
❯ vault list pki/issuers

# Check the default pki/ mount issuer -- this will return one of the two standard pki/ mount issuer uids from the preceding cmd:
❯ vault read pki/config/issuers

# View the new intermediate signer key usage properties for the pki/ mount (example output shown):
❯ vault read -format=json pki/issuer/$DEFAULT_ISSUER  | jq -r '.data.certificate' | openssl x509 -noout -ext keyUsage
X509v3 Key Usage: critical
    Digital Signature, Key Encipherment, Certificate Sign, CRL Sign

# Finally check the ca_chain output and make sure the structure of two certificates is shown:
❯ curl -s -H "X-Vault-Token: $VAULT_TOKEN" $VAULT_ADDR/v1/pki/ca_chain
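The final ca_chain check can also be done by counting certificates instead of reading the output by eye. This sketch assumes the endpoint returns a plain PEM bundle; the function name `count_pem_certs` is hypothetical.

```shell
# Hypothetical helper: count PEM certificates on stdin by their BEGIN markers.
count_pem_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----'
}

# Possible usage; with the standard aws config the expected count is 2:
#   curl -s -H "X-Vault-Token: $VAULT_TOKEN" "$VAULT_ADDR/v1/pki/ca_chain" | count_pem_certs
```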

The vault agent on each Nomad client machine will pull the updated pki, which gets placed in /etc/ssl/certs/, once the TTL of the existing pki cert expires. Rather than waiting a few days for this to happen, the vault-agent metal service can be restarted on each Nomad client for a controlled update. Restart the nodes one by one, starting with the node(s) that would be least disruptive to any running jobs, and confirm that the updated pki switchover is successful before continuing. The consul, nomad and nomad-follower metal services may also need a manual restart after vault-agent is restarted.
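The one-by-one restart could be scripted roughly as follows. Everything here is an assumption and not part of the original procedure: the function name `rolling_restart`, ssh access to each node, and vault-agent being a systemd unit restartable with `sudo systemctl`. Setting `DRY_RUN=1` only prints the commands.

```shell
# Hypothetical helper: restart vault-agent on each given node in order,
# pausing for operator confirmation between nodes.
# Assumes ssh access and a systemd unit named vault-agent.
rolling_restart() {
  for node in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Dry run: show what would be executed without touching the node.
      echo "ssh $node sudo systemctl restart vault-agent"
      continue
    fi
    ssh "$node" sudo systemctl restart vault-agent || return 1
    printf 'Restarted vault-agent on %s; verify the node, then press Enter: ' "$node"
    read -r reply
  done
}

# Possible usage, least disruptive node first (node names are placeholders):
#   DRY_RUN=1 rolling_restart client-a client-b
```

The pause between nodes leaves room to check the pki switchover, and to restart consul, nomad and nomad-follower if needed, before moving on.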

johnalotoski avatar Nov 08 '22 03:11 johnalotoski