(aws-eks): KubernetesManifest to support `--prune-whitelist` flag
Describe the feature
(Basically re-opening https://github.com/aws/aws-cdk/issues/13658)
Allow users to provide a list of resource types as the scope of pruning when creating a KubernetesManifest construct.
Use Case
1. We deploy custom resources such as elbv2.k8s.aws/v1beta1/TargetGroupBinding via CDK8s-generated manifests, but these custom resources are never pruned unless the --prune-whitelist flag explicitly lists them (see the sketch after this list).
(the following two items are copied from https://github.com/aws/aws-cdk/issues/13658)
2. Sometimes we only want to prune certain resources. For example, in this discussion, --prune-whitelist can be used to avoid namespaces getting pruned. It makes --prune safer and more useful in practice.
3. The role that runs kubectl may not have list permission on cluster-wide PersistentVolumes or DaemonSets, and enabling --prune will throw a permission error, even though there are no resources to prune.
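To make the first use case concrete, here is a minimal sketch of how such a custom resource is deployed today with pruning enabled (names, namespace, ARN, and spec fields are illustrative placeholders, and the snippet assumes it runs inside a Stack or Construct). Because TargetGroupBinding is not in kubectl's default prune whitelist, removing it from the manifest later does not delete it from the cluster:

```ts
import * as eks from 'aws-cdk-lib/aws-eks';

declare const cluster: eks.Cluster;

new eks.KubernetesManifest(this, 'TargetGroupBinding', {
  cluster,
  prune: true, // pruning is enabled, but kubectl's default whitelist does not cover this CRD type
  manifest: [{
    apiVersion: 'elbv2.k8s.aws/v1beta1',
    kind: 'TargetGroupBinding',
    metadata: { name: 'my-service-tgb', namespace: 'default' },
    spec: {
      targetGroupARN: 'arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/my-tg/abc123',
      serviceRef: { name: 'my-service', port: 80 },
    },
  }],
});
```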
Proposed Solution
Add a pruneWhitelist property to KubernetesManifest, an array of strings that lets users provide the list of resource types eligible for pruning.
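A sketch of how the proposed property might look in use. pruneWhitelist does not exist today; the entry format shown (group/version/kind, with core/ for the core API group) mirrors what kubectl's --prune-whitelist / --prune-allowlist flag expects:

```ts
import * as eks from 'aws-cdk-lib/aws-eks';

declare const cluster: eks.Cluster;

// Hypothetical API: `pruneWhitelist` is the property proposed in this issue, not an existing prop.
new eks.KubernetesManifest(this, 'Manifest', {
  cluster,
  prune: true,
  pruneWhitelist: [
    'core/v1/ConfigMap',
    'apps/v1/Deployment',
    'elbv2.k8s.aws/v1beta1/TargetGroupBinding',
  ],
  manifest: [/* ...manifest objects... */],
});
```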
Other Information
No response
Acknowledgements
- [ ] I may be able to implement this feature request
- [ ] This feature might incur a breaking change
CDK version used
2.28.1
Environment details (OS name and version, etc.)
Mac and AL2
We just ran into this as well - this seems like an easy fix.
Adding our support to this - not only to make things safer, but also so that K8s prunes, by default, all the resources CDK might create.
N.B.: the kubectl flag is now --prune-allowlist, not --prune-whitelist.
K8s has an internal default allowlist of resource types, which critically doesn't include the [Cluster]Role[Binding] resources that we create in our EKS stack during initial cluster setup, so that a well-known set of roles and bindings is available for engineers to assume:
https://github.com/kubernetes/kubernetes/blob/e53f93c7bb9fe28e5de2799da8eb1c62bdd4f4f1/staging/src/k8s.io/kubectl/pkg/util/prune/prune.go#L39-L59
That means whenever we edit this list and remove a resource, it's not actually deleted from the cluster. The K8s docs say:
--prune-allowlist: A list of group-version-kinds (GVKs) to consider for pruning. This flag is optional but strongly encouraged, as its default value is a partial list of both namespaced and cluster-scoped types, which can lead to surprising results.
My kubectl Lambda is also now (on the version for K8s 1.28) giving me a warning that auto-pruning of non-namespaced resources is deprecated and will be removed in a later version.
You wouldn't even (necessarily) need to make this a user-configurable field. A CloudFormation UPDATE call (as opposed to CREATE or DELETE) receives both the old and new versions of the CFn resource (i.e. the manifest), and is also the only place where you'd conceivably expect to be pruning K8s resources. So you could enumerate all the resource types in the old version of the manifest, and set those as the prune allowlist automatically. Or, you could compare old and new manifests and explicitly delete items not in the new one.
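To illustrate the idea of deriving the allowlist automatically (this is not part of aws-cdk; the handler shape, label placeholder, and function name are assumptions), an UPDATE handler could enumerate the group/version/kind of every object in the old manifest and pass those entries to kubectl:

```ts
// Minimal shape of a manifest object; real objects also carry metadata, spec, etc.
interface K8sObject {
  apiVersion: string; // 'v1' for core resources, '<group>/<version>' otherwise
  kind: string;
}

// Derive the prune allowlist from the *old* manifest received in a CloudFormation UPDATE.
// kubectl expects entries as <group>/<version>/<Kind>, with core resources under 'core/'.
function pruneAllowlistFromOldManifest(oldManifest: K8sObject[]): string[] {
  const entries = new Set<string>();
  for (const obj of oldManifest) {
    const groupVersion = obj.apiVersion.includes('/') ? obj.apiVersion : `core/${obj.apiVersion}`;
    entries.add(`${groupVersion}/${obj.kind}`);
  }
  return Array.from(entries);
}

// The handler would then pass one --prune-allowlist flag per entry alongside the
// CDK-generated prune label, roughly:
//   kubectl apply --prune -l <cdk-prune-label> --prune-allowlist=core/v1/ConfigMap ... -f manifest.yaml
```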
I don't think the linked discussion (on the K8s GitHub) about this being used for safety is relevant to CDK. CDK always specifies a prune label to work out what is eligible for pruning, so it is already safe: it would never try to delete the kube-system namespace or similar, because CDK didn't create that namespace and never applied the label to it.