cluster-api-provider-aws
Unclear default behavior of SecureSecretsBackend
/kind bug
What steps did you take and what happened: I'm still new to CAPI/CAPA, so maybe I'm misunderstanding something. This is related to https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/2064
From the description it seems that by default it should use the Secrets Manager: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/634f30eb5a3ab42ec21e358cfb86da740784aed7/api/v1alpha3/awsmachine_types.go#L167-L172
But I'm unclear on how it's being used.
- Deployed a cluster using default config from the quickstart.
- In the bootstrap cluster I see that the cluster.x-k8s.io/secret secrets contain all the Kubernetes certs. (Should these still be here? I thought that's what the backend would be used for.)
- The AWSMachineTemplate resources don't have the default SecureSecretsBackend set.
- If I go to the AWS console I don't see the secret.
- But when I deployed into a private subnet with no NAT Gateway, the instance failed to come up until I added a secretsmanager VPC endpoint for the init script.
What did you expect to happen: Either Secrets Manager is used to load the Kubernetes certs by default and they are not passed in the cloud-init data, or, if the default is to not use AWS Secrets Manager, there should not be an init script.
Anything else you would like to add: I also tried to explicitly set the value, but I'm still seeing the certs in the Secret and no secrets in AWS Secrets Manager.
cloudInit:
  secureSecretsBackend: secrets-manager
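For context, this is roughly where I set that field in the full AWSMachineTemplate (a minimal sketch assuming the v1alpha3 API that ships with CAPA v0.6.4; the name and instance type below are placeholders, not my real values):

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSMachineTemplate
metadata:
  name: my-cluster-control-plane   # placeholder name
spec:
  template:
    spec:
      instanceType: t3.large       # placeholder instance type
      cloudInit:
        # the API enum allows "secrets-manager" and "ssm-parameter-store"
        secureSecretsBackend: secrets-manager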
Environment:
CAPA v0.6.4
You will see secrets in Kubernetes - that's an expected consequence of how Cluster API works. What we're really doing is not storing the certificates in the EC2 userdata. The problem being solved here is that if you have read only access to the EC2 console, then you can reconstruct the Kubernetes CA certificate independently of any RBAC set on the Kubernetes cluster running Cluster API.
You won't see any secrets in Secrets Manager until CAPA starts provisioning an EC2 instance, and if the VPC endpoint isn't available, then this is necessarily going to break. In addition, secrets are deleted ASAP after machine provisioning, so the lifetime is often < 1 minute.
We can clarify the documentation here though.
/kind documentation
Thanks for the explanation @randomvariable! Yes I think some additional docs would be great.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/lifecycle frozen
@dkoshkin did you find any documentation for this?
Can you please explain in a bit more detail what exactly "added a secretsmanager VPC endpoint for the init script" means?
/remove-lifecycle frozen
@nehatomar12
For my setup I deployed the bootstrap cluster into a private subnet (with no InternetGateway or other routing set up to reach the Internet). When that happens, the AWS APIs also become inaccessible. To get around this and let CAPA reach the AWS APIs, I created some VPC endpoints.
Here is what that would look like in Terraform for secretsmanager.
resource "aws_vpc_endpoint" "secretsmanager" {
vpc_id = aws_vpc.my_vpc.id
service_name = "com.amazonaws.us-west-2.secretsmanager"
vpc_endpoint_type = "Interface"
security_group_ids = [aws_security_group.private.id]
// the bastion machine will be accessing this endpoint
subnet_ids = [aws_subnet.public.id]
private_dns_enabled = true
tags = var.tags
}
Similarly I created VPC endpoints for the following services (a combined sketch follows this list):
resource "aws_vpc_endpoint" "ec2"
resource "aws_vpc_endpoint" "elasticloadbalancing"
resource "aws_vpc_endpoint" "autoscaling"
resource "aws_vpc_endpoint" "secretsmanager"
resource "aws_vpc_endpoint" "ssm"
resource "aws_vpc_endpoint" "ssmmessages"
resource "aws_vpc_endpoint" "ec2messages"
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.