
(aws-eks): KubernetesManifest to support `--prune-whitelist` flag

Open chu-yik opened this issue 3 years ago • 2 comments

Describe the feature

(Basically re-opening https://github.com/aws/aws-cdk/issues/13658)

Allow users to provide a list of resources as the scope of pruning when creating KubernetesManifest construct

Use Case

1. We deploy custom resources such as elbv2.k8s.aws/v1beta1/TargetGroupBinding via CDK8s-generated manifests, but these custom resources are not pruned by default unless the --prune-whitelist flag is explicitly specified.

(following copied from https://github.com/aws/aws-cdk/issues/13658)

1. Sometimes we only want to prune certain resources. For example, in this discussion, --prune-whitelist can be used to prevent a namespace from being pruned. This makes --prune safer and more useful in practice.

2. The role that runs kubectl may not have list permission on cluster-wide resources such as PersistentVolumes or DaemonSets, so enabling --prune throws a permission error even when there are no resources to prune.
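For context, here is a minimal example of the kind of resource affected: a TargetGroupBinding from the AWS Load Balancer Controller, whose group-version-kind is not on kubectl's default prune allowlist. Names and the ARN below are placeholders.

```yaml
# Illustrative TargetGroupBinding manifest (name, namespace, and ARN are placeholders).
apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: example-tgb
  namespace: default
spec:
  serviceRef:
    name: example-service
    port: 80
  targetGroupARN: arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/example/0123456789abcdef
```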

Proposed Solution

Add a pruneWhitelist property to KubernetesManifest, an array of strings that lets users specify which resource types are eligible for pruning.
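A rough sketch of how such a prop might be threaded through to the kubectl invocation built by the EKS kubectl handler. The prop name, the label value, and buildKubectlArgs are all illustrative assumptions, not the real aws-cdk-lib API.

```typescript
// Hypothetical sketch only: none of these names are the real aws-cdk-lib API.
interface KubernetesManifestOptions {
  prune?: boolean;
  // Group-version-kinds, e.g. 'elbv2.k8s.aws/v1beta1/TargetGroupBinding'
  pruneWhitelist?: string[];
}

function buildKubectlArgs(opts: KubernetesManifestOptions): string[] {
  const args = ['apply', '-f', '-'];
  if (opts.prune) {
    // CDK scopes pruning with a label selector; the label value here is a placeholder.
    args.push('--prune', '-l', 'aws.cdk.eks/prune-example');
    for (const gvk of opts.pruneWhitelist ?? []) {
      args.push(`--prune-whitelist=${gvk}`);
    }
  }
  return args;
}
```

For example, passing prune: true with pruneWhitelist: ['elbv2.k8s.aws/v1beta1/TargetGroupBinding'] would emit a --prune-whitelist argument for that GVK alongside the usual --prune flag.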

Other Information

No response

Acknowledgements

  • [ ] I may be able to implement this feature request
  • [ ] This feature might incur a breaking change

CDK version used

2.28.1

Environment details (OS name and version, etc.)

Mac and AL2

chu-yik avatar Sep 16 '22 21:09 chu-yik

We just ran into this as well - this seems like an easy fix.

diranged avatar Oct 14 '23 16:10 diranged

Adding our support to this, not only to make things safer, but also so that K8s prunes all resources CDK might create by default.

N.B. the kubectl flag is now --prune-allowlist, not --prune-whitelist.

K8s has a default internal allowlist of resource types. Critically, it doesn't include the [Cluster]Role[Binding] resources that we create in our EKS stack during initial cluster setup, so that a well-known set of roles and bindings is available for engineers to assume:

https://github.com/kubernetes/kubernetes/blob/e53f93c7bb9fe28e5de2799da8eb1c62bdd4f4f1/staging/src/k8s.io/kubectl/pkg/util/prune/prune.go#L39-L59

That means whenever we edit this list and remove a resource, it's not actually deleted from the cluster. The K8s docs say:

--prune-allowlist: A list of group-version-kinds (GVKs) to consider for pruning. This flag is optional but strongly encouraged, as its default value is a partial list of both namespaced and cluster-scoped types, which can lead to surprising results.

My kubectl Lambda is also now (on the version for K8s 1.28) giving me a warning that auto-pruning of non-namespaced resources is deprecated and will be removed in a later version.

You wouldn't even (necessarily) need to make this a user-configurable field. A CloudFormation UPDATE call (as opposed to CREATE or DELETE) receives both the old and new versions of the CFn resource (i.e. the manifest), and is also the only place where you'd conceivably expect to be pruning K8s resources. So you could enumerate all the resource types in the old version of the manifest, and set those as the prune allowlist automatically. Or, you could compare old and new manifests and explicitly delete items not in the new one.
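The automatic approach above could be sketched like this. It is self-contained and assumes manifest objects are the usual records with apiVersion and kind; the 'core/' prefix for core-group resources matches the GVK format used in kubectl's default allowlist.

```typescript
// Sketch: derive a prune allowlist from the *old* manifest during a
// CloudFormation UPDATE, as suggested above.
interface K8sObject {
  apiVersion: string; // e.g. 'v1' or 'elbv2.k8s.aws/v1beta1'
  kind: string;       // e.g. 'ConfigMap' or 'TargetGroupBinding'
}

function pruneAllowlistFromManifest(objects: K8sObject[]): string[] {
  const gvks = new Set<string>();
  for (const obj of objects) {
    // An apiVersion without a '/' means the core API group.
    const gv = obj.apiVersion.includes('/') ? obj.apiVersion : `core/${obj.apiVersion}`;
    gvks.add(`${gv}/${obj.kind}`);
  }
  return [...gvks].sort();
}
```

Each --prune-allowlist entry would then be taken from the returned list, so only types that actually appeared in the previous manifest are ever considered for pruning.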

I don't think the linked discussion (on the K8s GitHub) about this being used for safety is relevant to CDK. CDK always applies a prune label used to work out what is eligible for pruning, so it is already safe: it would never try to delete the kube-system namespace or similar, because CDK didn't create that namespace and so never applied the label to it.

dancmeyers avatar Nov 29 '23 12:11 dancmeyers