cluster-api-provider-aws
Clear out sensitive files after successful bootstrap
/kind feature
We keep some files around that contain sensitive information; ideally they should be removed once all the conditions for proper cluster creation have been met. I would like a high level of confidence that everything in those files has been used, so that the files can be safely deleted without impacting future troubleshooting or remediation.
Describe the solution you'd like: Delete sensitive files that are no longer needed, reducing the number of locations on a given VM that have to be protected.
Anything else you would like to add: Some of this might require changes to Cluster API and not just Cluster API Provider AWS.
Environment:
- Cluster-api-provider-aws version:
- Kubernetes version (use kubectl version):
- OS (e.g. from /etc/os-release):
This is partly a CAPI concern, and partly a matter of deleting /etc/cloud-init-secret.yaml, the file created by the secrets manager script.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
This can be remediated with a post-kubeadm script, but it would still be a good thing to do as built-in security cleanup.
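For anyone who wants that workaround today, the cleanup can be expressed directly in the bootstrap config. Below is a minimal sketch, assuming a KubeadmConfigTemplate and the v1beta1 API; the object name and file paths are illustrative only and vary by image and bootstrap provider:

```yaml
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: example-md-0            # hypothetical name
spec:
  template:
    spec:
      postKubeadmCommands:
        # Remove bootstrap artifacts that may still hold join tokens or certs.
        # Adjust the paths to whatever your image actually writes.
        - rm -f /run/kubeadm/kubeadm-join-config.yaml /tmp/kubeadm.yaml
        - rm -f /var/lib/cloud/instances/*/cloud-config.txt
```

The same postKubeadmCommands field is available on KubeadmControlPlane under spec.kubeadmConfigSpec for control plane machines.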
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/triage accepted
/priority important-longterm
/lifecycle frozen
/help
Need to understand cloud-init a bit if you're going to pick it up. Do reach out.
@randomvariable: This request has been marked as needing help from a contributor.
Guidelines
Please ensure that the issue body includes answers to the following questions:
- Why are we solving this issue?
- To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
- Does this issue have zero to low barrier of entry?
- How can the assignee reach out to you for help?
For more details on the requirements of such an issue, please see here and ensure that they are met.
If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.
In response to this:
/help
Need to understand cloud-init a bit if you're going to pick it up. Do reach out.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
I will write an ADR for this with @shysank to explain how AWS Systems Manager would be a good fit for this cleanup. https://docs.aws.amazon.com/systems-manager/latest/userguide/execute-remote-commands.html
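For context, Run Command executes an SSM document against instances. A hedged sketch of what such a cleanup document could look like (the document content and paths below are assumptions for illustration, not something from the ADR):

```yaml
# Hypothetical SSM Command document (schemaVersion 2.2) an operator could run
# via Run Command against workload cluster instances. Paths are examples only.
schemaVersion: "2.2"
description: Remove bootstrap files that may contain sensitive data
mainSteps:
  - action: aws:runShellScript
    name: cleanupBootstrapFiles
    inputs:
      runCommand:
        - rm -f /run/kubeadm/kubeadm-join-config.yaml /tmp/kubeadm.yaml
```

Something like `aws ssm send-command --document-name <name> --targets "Key=instanceids,Values=<instance-id>"` could then invoke it, provided the instances run the SSM agent and have an instance profile that permits it.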
CAPZ has a similar mechanism for running commands in VMs post-deployment, called VM extensions. It is not used for cleaning up sensitive data yet, only for setting conditions based on the existence of the sentinel file /run/cluster-api/bootstrap-success.complete.
Relevant issues:
- https://github.com/kubernetes-sigs/cluster-api/issues/1739
- https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/issues/582
- https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/915
Some locations to check for sensitive info (haven't confirmed the accuracy of the list below yet):
- /var/lib/cloud/instances/i-002dcd6cf5a525f11/cloud-config.txt
- /run/kubeadm/kubeadm-join-config.yaml
- /tmp/kubeadm.yaml
- var/log/cloud-init-logs
/assign
cc @srm09 for CAPV
Just make sure it's toggle-able, since AWS still operates a few locations without AWS Systems Manager
Thanks for the early feedback @voor. I was thinking that once this is in, we could also benefit from using AWS Systems Manager to collect more info from the instances in the future.
In that case, a postKubeadmConfig script might be a solution that would work everywhere.
Looking at this list, looks like it exists in all regions: https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/
Since AWS Systems Manager support does not exist in every region, we will go with PostKubeadmCommands. This doesn't really need a design doc.
/unassign
/assign
/remove-lifecycle frozen
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
This issue has not been updated in over 1 year, and should be re-triaged.
You can:
- Confirm that this issue is still relevant with /triage accepted (org members only)
- Close this issue with /close
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted