consul-k8s
Helm + VaultBackend + ACLs | Bootstrap ACLs and store the bootstrapToken and the replicationToken in Vault
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
- Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
- If you are interested in working on this issue or have submitted a pull request, please leave a comment.
Is your feature request related to a problem? Please describe.
Based on the current documentation, it is difficult to work out the proper way to bootstrap ACLs when installing Consul through Helm for a multi-DC federated setup.
This feature would remove the overhead of understanding the ACL bootstrap process from operators and delegate it to consul-k8s.
Speaking from personal experience, setting up the multi-DC Consul service mesh without ACLs took a lot of time. With the recent features added, the setup became easier and I have reached a stable configuration without the ACLs. I do appreciate the great effort that went in to make the deployments via Helm charts easier to configure. 👏👏👏
However, now that I am trying to enable ACLs while using the Consul Helm charts, I find the documentation is not clear enough. Documentation on the purpose of the bootstrap token and the replication token, along with some information on how they are used under the hood (that is, what happens while bootstrapping multiple DCs in a federated mesh, and where the bootstrap and replication tokens play a role), would really help.
For now, if the feature described below were available, operators would not have to fret about ACL bootstrapping and could simply move on to using the ACLs directly. Operators could just pass the secret name and key of the replication token in the values.yaml for all the secondary DCs. As for the bootstrap token, once the setup is complete, operators could simply fetch it from Vault in order to configure the ACLs further.
Feature Description
In a primary DC, if Vault backend secret names and keys for the bootstrapToken and the replicationToken are provided but their contents are empty, bootstrap the ACLs as if the tokens were not provided, and write the generated tokens to the respective Vault secret paths.
Details:
For the primary DC, if the following is provided in the values.yaml for the Helm chart installation:
Note: Only the fields relevant to this context are shown.
global:
  secretsBackend:
    vault:
      enabled: true
      manageSystemACLsRole: consul-server-acl-init
  acls:
    manageSystemACLs: true
    bootstrapToken:
      secretName: consul/data/secrets/bootstrap-token
      secretKey: token
    createReplicationToken: true
    replicationToken:
      secretName: consul/data/secrets/replication-token
      secretKey: token
but the values of the referenced secrets are empty strings, i.e., the secrets have been created with the following commands:
vault kv put consul/secrets/bootstrap-token token=""
vault kv put consul/secrets/replication-token token=""
then the server-acl-init job should bootstrap the ACLs as if the secret names and keys were not provided at all.
To rephrase, the server-acl-init job should treat the above config the same as the following:
global:
  secretsBackend:
    vault:
      enabled: true
      manageSystemACLsRole: consul-server-acl-init
  acls:
    manageSystemACLs: true
    createReplicationToken: true
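The "empty secret means not provided" rule could be checked roughly as follows. This is a minimal Go sketch, not code from consul-k8s: the function name needsBootstrap and the sample payloads are made up for illustration, and it assumes the secret lives in a KV v2 engine (where a read response nests the user data under data.data) and that the stored value is a string.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// kvV2ReadResponse models only the part of a Vault KV v2 read response
// that this sketch needs: the user data nested under data.data.
type kvV2ReadResponse struct {
	Data struct {
		Data map[string]string `json:"data"`
	} `json:"data"`
}

// needsBootstrap (hypothetical helper) reports whether ACLs should be
// bootstrapped as if no token had been configured. It returns true when
// the key is absent or holds only whitespace.
func needsBootstrap(body []byte, secretKey string) (bool, error) {
	var resp kvV2ReadResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, fmt.Errorf("decoding KV v2 response: %w", err)
	}
	return strings.TrimSpace(resp.Data.Data[secretKey]) == "", nil
}

func main() {
	// Sample payloads are invented for this example.
	empty := []byte(`{"data":{"data":{"token":""},"metadata":{"version":1}}}`)
	set := []byte(`{"data":{"data":{"token":"example-token"},"metadata":{"version":2}}}`)
	for _, body := range [][]byte{empty, set} {
		bootstrap, err := needsBootstrap(body, "token")
		if err != nil {
			panic(err)
		}
		fmt.Println("bootstrap needed:", bootstrap)
	}
}
```

Treating a whitespace-only value as empty is a design choice of this sketch; the request above only mentions empty strings.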
Once the bootstrapping is done for the primary DC, the bootstrapToken and the replicationToken generated during the process should be written to Vault at the respective secret paths. That is, the equivalent of the following commands should be executed:
vault kv put consul/secrets/bootstrap-token token="${bootstrap_token}"
vault kv put consul/secrets/replication-token token="${replication_token}"
This also means that the role for manageSystemACLsRole: consul-server-acl-init should have write permissions on the two Vault secret paths in question. These write permissions are only required in the primary DC; for secondary DCs, read permissions are enough.
DC1
vault policy write consul-server-acl-init-dc1-policy - <<-EOF
path "consul/data/secrets/bootstrap-token" {
  capabilities = ["read", "update"]
}
path "consul/data/secrets/replication-token" {
  capabilities = ["read", "update"]
}
EOF
vault write auth/kubernetes-dc1/role/consul-server-acl-init \
  bound_service_account_names=consul-server-acl-init \
  bound_service_account_namespaces=consul \
  policies=consul-server-acl-init-dc1-policy \
  ttl=1h
DC2
vault policy write consul-server-acl-init-dc2-policy - <<-EOF
path "consul/data/secrets/replication-token" {
  capabilities = ["read"]
}
EOF
vault write auth/kubernetes-dc2/role/consul-server-acl-init \
  bound_service_account_names=consul-server-acl-init \
  bound_service_account_namespaces=consul \
  policies=consul-server-acl-init-dc2-policy \
  ttl=1h
vault write auth/kubernetes-dc2/role/consul-server \
  bound_service_account_names=consul-server \
  bound_service_account_namespaces=consul \
  policies=consul-server-acl-init-dc2-policy \
  ttl=1h
Contributions
If the feature seems valid and is approved, I'll be glad to contribute to this and raise a PR.
@Sushobhan123, thank you for this suggestion. I think it sounds great. @david-yu, what do you think about the feature?
It reminds me of the work done for automating gossip encryption. If you do end up implementing this solution, these two pull requests may be helpful as a reference: https://github.com/hashicorp/consul-k8s/pull/738, https://github.com/hashicorp/consul-k8s/pull/772.
Hi @Sushobhan123, we were considering adding some Consul K8s CLI enhancements to make the process of quickly bootstrapping federated services even simpler.
Traditionally we have not tried to introduce write or update access to roles that involve Consul K8s making API calls to the Vault secrets backend, because creating and updating such secrets should be handled by an operator with escalated privileges and not a long lived process.
I'm also open to seeing what others who watch Consul K8s issues are thinking. We believe implementing automation for bootstrapping secure secrets like the bootstrap and replication token would be better suited in the CLI to improve the UX for the setup of Consul K8s with Vault, given the sensitivity of such secrets in Vault.
@t-eckert Thank you for the references. I see that in https://github.com/hashicorp/consul-k8s/pull/738, the work was first done using curl, and then written using Go in https://github.com/hashicorp/consul-k8s/pull/772.
Just for clarity, what I meant by "the equivalent of the vault kv put commands should be executed", is that the equivalent code should be written using Go in control-plane/subcommand/server-acl-init/command.go.
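To make the proposal concrete, the intended flow could look something like the Go sketch below. It is not the actual server-acl-init code: SecretStore, memoryStore, and ensureToken are hypothetical names, the in-memory store stands in for Vault, and the bootstrap function stands in for the existing ACL bootstrap logic.

```go
package main

import (
	"errors"
	"fmt"
)

// SecretStore is a hypothetical abstraction over the Vault secrets backend.
type SecretStore interface {
	Read(secretName, secretKey string) (string, error)
	Write(secretName, secretKey, value string) error
}

// memoryStore is an in-memory stand-in for Vault, used only in this example.
type memoryStore map[string]string

func (m memoryStore) Read(name, key string) (string, error) {
	v, ok := m[name+"#"+key]
	if !ok {
		return "", errors.New("secret not found: " + name)
	}
	return v, nil
}

func (m memoryStore) Write(name, key, value string) error {
	m[name+"#"+key] = value
	return nil
}

// ensureToken returns the stored token if it is non-empty. Otherwise it
// runs bootstrap, writes the generated token back to the store, and
// returns it.
func ensureToken(store SecretStore, name, key string, bootstrap func() (string, error)) (string, error) {
	existing, err := store.Read(name, key)
	if err != nil {
		return "", fmt.Errorf("reading %s: %w", name, err)
	}
	if existing != "" {
		return existing, nil
	}
	token, err := bootstrap()
	if err != nil {
		return "", fmt.Errorf("bootstrapping ACLs: %w", err)
	}
	if err := store.Write(name, key, token); err != nil {
		return "", fmt.Errorf("writing %s: %w", name, err)
	}
	return token, nil
}

func main() {
	const name, key = "consul/data/secrets/bootstrap-token", "token"
	store := memoryStore{name + "#" + key: ""} // secret exists but is empty
	bootstrap := func() (string, error) { return "generated-token", nil }

	token, err := ensureToken(store, name, key, bootstrap)
	if err != nil {
		panic(err)
	}
	stored, _ := store.Read(name, key)
	fmt.Println("token in use:", token)
	fmt.Println("token stored:", stored)
}
```

Keeping the store behind an interface is just one way to make the flow testable without a running Vault; a real implementation would likely need to handle a failed write after a successful bootstrap, which this sketch does not address.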
we were considering adding some Consul K8s CLI enhancements
@david-yu Is the sentence below a correct paraphrase of the above?
"We were considering writing the code using the Vault Go client, and not curl or the vault CLI"
creating and updating such secrets should be handled by an operator with escalated privileges and not a long lived process
@david-yu Given that the server-acl-init-job is not a long lived process, if the functionality is added in control-plane/subcommand/server-acl-init/command.go (or a new file under control-plane/subcommand/server-acl-init/, if that is preferred), to handle the vault writes implicitly, shouldn't that be enough?
Or do you mean to introduce a new subcommand, when you say "adding some Consul K8s CLI enhancements"?
Please share your thoughts.
P.S.: I'll wait for a go-ahead from you before I start any work on this. Also, if the team is already working on this, do let me know and I'll skip it.
By the way, I got an error when trying to bootstrap the Consul server ACLs with Vault as the backend, using the latest Helm chart, 0.43.0.
Maybe the issue is related to this feature:
global:
  secretsBackend:
    vault:
      enabled: true
      consulServerRole: "consul-server"
      consulClientRole: ""
      manageSystemACLsRole: "consul-server-acl-init"
      agentAnnotations: null
      consulCARole: "consul-ca"
      ca:
        secretName: "vault-ca-cert"
        secretKey: "tls.crt"
      connectCA:
        address: "https://vault-active:8200"
        authMethodPath: "kubernetes"
        rootPKIPath: "connect_root"
        intermediatePKIPath: "connect_inter"
        additionalConfig: |
          {}
  acls:
    bootstrapToken:
      secretName: secret/data/consul/bootstrap-token
      secretKey: token
    createReplicationToken: false
    manageSystemACLs: true
    partitionToken:
      secretName: secret/data/consul/partition-token
      secretKey: token
    replicationToken:
      secretName: secret/data/consul/replication-token
      secretKey: token
It produces an error like:
2022-04-26T06:12:29.888Z [ERROR] Failure: calling /agent/self to get datacenter: err="Unexpected response code: 403 (ACL not found)"
2022-04-26T06:12:29.888Z [INFO] Retrying in 1s
2022-04-26T06:12:30.889Z [ERROR] Failure: calling /agent/self to get datacenter: err="Unexpected response code: 403 (ACL not found)"
2022-04-26T06:12:30.889Z [INFO] Retrying in 1s
2022-04-26T06:12:31.892Z [ERROR] Failure: calling /agent/self to get datacenter: err="Unexpected response code: 403 (ACL not found)"
I guess it's because global.acls.manageSystemACLs is still not supported when using Vault as a backend, as @Sushobhan123 explained?
Anyway, is there any workaround to bootstrap the Consul ACLs when using Vault as the secrets backend?
Hi @kholisrag! With the latest release of Consul-k8s, 0.43.0, we fully support ACLs being stored in Vault, so this should work!
I do not see an obvious error in the values.yaml that you attached, but other things are also in play, such as ensuring that the Vault roles and policies are set up correctly, and those are not shown here. It should, however, work! If you'd like, please do file a bug and we'll help get you up and running!
Yes, it's supported. What is not supported is bootstrapping the consul-server ACLs on a fresh install; for a Consul server whose ACLs are already bootstrapped, the Vault backend works perfectly.
In my case, to fix this, I modified the consul-server-acl-init job to upload the bootstrapped ACL token to Vault, using a Helm post-renderer and kustomize.
Hi @kholisrag it does look like what you are looking for is related to this feature request. We currently require you to bootstrap the token manually as described here: https://www.consul.io/docs/k8s/installation/vault/data-integration/bootstrap-token. The WAN Federation workflow with the secrets backend is also fully documented here as well: https://www.consul.io/docs/k8s/installation/vault/wan-federation
We were hoping to do more investigations on the CLI side as opposed to building a Kubernetes job like @Sushobhan123 described to create both the ACL bootstrap and replication tokens as that requires an operator to explicitly execute commands that create secure tokens. We would need to do more investigation into the security impact of running such a process in a Kubernetes job before deciding to go in that direction.
@kholisrag Do you have more details on how you were able to do this? A gist or GitHub repo would be useful to glance at.
In my case, to fix this, I modified the consul-server-acl-init job to upload the bootstrapped ACL token to Vault, using a Helm post-renderer and kustomize.
Hi @Sushobhan123, I've implemented a fix for this in https://github.com/hashicorp/consul-k8s/pull/1920 (starting with just the bootstrap token). It's been a little while, but I'm interested in any feedback you might have on that. Thank you!