consul-template

GCP service account key not being refreshed before TTL

Open lneva-fastly opened this issue 3 years ago • 2 comments

Consul Template version

0.26.0 (also seen on 0.21.0)

Configuration

  • Setting VAULT_ADDR to point to the unix domain socket exposed by vault-agent
    • vault-agent is using approle authentication and it's working well
  • Passing these arguments to consul-template: -consul-retry-attempts=0 -vault-retry-attempts=0
  • Also passing 2 -template "template-path:output-path:shell script + args" arguments
{{- with secret "gcp/key/database-backups" -}}
  {{- index .Data "private_key_data" | base64Decode -}}
{{- end -}}
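
For readers less familiar with the template syntax, here is a rough Python sketch of what the template above is meant to produce: it takes the `private_key_data` field from the secret's data and base64-decodes it into the JSON key file. The `render_key` helper and the fake payload are made up for illustration, and nothing here talks to Vault.

```python
import base64
import json


def render_key(secret_data: dict) -> str:
    """Mimic the template: index .Data "private_key_data" | base64Decode."""
    return base64.b64decode(secret_data["private_key_data"]).decode("utf-8")


# Hypothetical stand-in for a real service account key (assumption, not real data).
fake_key = {"type": "service_account", "private_key_id": "example"}

# Hypothetical stand-in for the .Data map of the secret returned by Vault.
secret_data = {
    "private_key_data": base64.b64encode(json.dumps(fake_key).encode()).decode()
}

rendered = render_key(secret_data)
print(json.loads(rendered)["type"])  # service_account
```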

Vault GCP config:

{
  "ttl": "12h",
  "max_ttl": "12h"
}

GCP role config for database-backups:

{
  "bindings": "resource \"buckets/redacted\" {\nroles = [\"roles/storage.objectAdmin\"]\n}",
  "project": "redacted",
  "secret_type": "service_account_key"
}

Command

see above

Debug output

Haven't gathered debug output yet. Would it be helpful?

Expected behavior

  • JSON service account key should be read from Vault
  • key should be evaluated into the template
  • after some amount of time less than 12 hours, a new key should be generated and the template re-evaluated

Actual behavior

  • JSON service account key is read from Vault
  • key is evaluated into the template
  • new key is not generated before 12 hours
  • the script using the key gets authentication errors after 12 hours
  • consul-template occasionally tries to renew the key and shows a message like this:
    • TTL of "12h" exceeded the effective max_ttl of "3h20m6s"; TTL value is capped accordingly
  • eventually consul-template gets an error while renewing, "lease not found", and at that point it finally fetches a new JSON key
    • this happens anywhere from 15 hours to 24+ hours after it got the key
  • cycle repeats, and I've never seen it get a new key before 12 hours
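
The capped-TTL warning above can be read as simple arithmetic, assuming max_ttl is counted from the moment the lease was issued (an assumption based on the wording of the message, not something confirmed in this thread). Under that assumption, a renewal can only extend the lease up to whatever is left of the original 12 hours. The function names below are hypothetical and only illustrate the calculation.

```python
from datetime import timedelta


def effective_max_ttl(max_ttl: timedelta, elapsed: timedelta) -> timedelta:
    """Time left before the lease reaches max_ttl, counted from issue time."""
    return max(max_ttl - elapsed, timedelta(0))


def capped_renewal(requested: timedelta, max_ttl: timedelta, elapsed: timedelta) -> timedelta:
    """A renewal request is capped at the effective max_ttl."""
    return min(requested, effective_max_ttl(max_ttl, elapsed))


max_ttl = timedelta(hours=12)                            # from the Vault GCP config
reported = timedelta(hours=3, minutes=20, seconds=6)     # from the warning message

# Working backwards: how old was the lease when the warning was logged?
elapsed = max_ttl - reported
print(elapsed)  # 8:39:54

# Requesting another 12h at that point only yields the remaining time.
print(capped_renewal(timedelta(hours=12), max_ttl, elapsed))  # 3:20:06
```

If that reading is right, renewing can never carry the lease past the 12 hour mark; only fetching a new key would.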

We set the TTL to 12 hours thinking there might be some inherent limit on the lifetime of GCP service account keys, but I now think we may have misinterpreted the documentation. We originally had a longer TTL, and consul-template still did not fetch a new key before the old one expired.

This is severely problematic. We can't seem to successfully use GCP service account credentials with consul-template. I'm not sure what to do other than switch to OAuth access tokens, which are highly undesirable for a number of reasons.

lneva-fastly, Jul 27 '21 18:07

I think this is due to incorrect caching in vault agent. My problem seems to match this comment: https://github.com/hashicorp/vault/issues/8953#issuecomment-656733690

Is consul-template known not to work properly when its requests are sent through vault agent?

lneva-fastly, Jul 28 '21 15:07

Hey @lneva-fastly, thanks for taking the time to report this.

Sorry to say I don't have a ton of experience with this sort of vault-agent, consul-template interaction, so I'm not going to be a ton of help right up front. If you come up with any more details to add, they might help. Kind of curious if you get a response to your question on the vault ticket you linked as well.

I will see what I can do, but at this point I don't have a great idea of where to start and will need to do a good bit of digging into vault-agent to get a better understanding of it.

Another place you might be able to find more people who use vault agent w/ consul-template is on our discuss forum (https://discuss.hashicorp.com/c/consul). You might consider asking there to see if a community member has come across anything similar.

eikenb, Jul 29 '21 22:07