enhancements
enhancements copied to clipboard
KEP-2839: Add KEP for in-use protection
- One-line PR description: A generic feature to protect objects from deletion while it is in use
- Issue link: https://github.com/kubernetes/enhancements/issues/2839
- Other comments:
Cross-linking an old issue: https://github.com/kubernetes/kubernetes/issues/10179
@lavalamp @bgrant0607
Sharing just a rough idea.
Just for preventing deletion, the below design will be enough (We still need to consider how to protect deletion of LienClaim
itself). Also, update may be implemented in the same way, like adding "spec-update" to LienClaim.spec.prevents
and appending uuid to *.metadata.spec-update-liens
. However, it would be difficult to define preventing update for specific fields as we can specify through "field manager", by this way.
- User creates
LienClaim
in their namespace with specifying targets and prevents.
apiVersion: v1beta1
kind: LienClaim
metadata:
namespace: ns1
name: my-lien
spec:
targets:
- apiVersion: v1
kind: secret
namespace: ns2
name: important-secret
prevents:
- delete
- Lien controller watches
LienClaim
and create cluster-scopedLien
and append its name to*-liens
, ordelete-liens
in the below example, to the target if a newLienClaim
is created. Note that a name forLien
is generated as UUID.
apiVersion: v1beta1
kind: Lien
metadata:
name: 12345678-1234-1234-1234-1234567890ab
spec:
targets:
- apiVersion: v1
kind: secret
namespace: ns2
name: important-secret
source:
namespace: ns1
name: my-lien
prevents:
- delete
apiVersion: v1
kind: Secret
metadata:
namespace: ns2
name: important-secret
delete-liens:
- 12345678-1234-1234-1234-1234567890ab
type: Opaque
data:
- Lien controller may also update
LienClaim
status if 2 succeeds
apiVersion: v1beta1
kind: LienClaim
metadata:
namespace: ns1
name: my-lien
spec:
targets:
- apiVersion: v1
kind: secret
namespace: ns2
name: important-secret
prevents:
- delete
status:
lien: 12345678-1234-1234-1234-1234567890ab
phase: preventing
-
On deletion of resources, lien admission controller checks if
metadata.delete-liens
is empty. If not, return error to block deletion. -
Lien controller watches
LienClaim
and delete the lien from*-liens
on the target and deleteLien
if theLienClaim
is deleted.
From the viewpoint of consuming lien from secret protection, which is my biggest concern. How to create LienClaim
before creating referencing objects and delete LienClaim
after deleting referencing objects is need to be considered.
(I'm not sure if mutation hook or ownerReference can achieve it. Although, we can at least leave it for users.)
Why do you want to have lien objects? Why are text strings not sufficient (as used for finalizers). Adding additional object types adds more places to have races, I'm against it unless you can convince me it's absolutely necessary.
Why do you want to have lien objects?
It's to make it easier to manage who has control over the specific lien in the delete-liens field.
If multiple liens are set to a single resource, we will need to ensure that the specific lien is managed by the client/controller when deleting the lien. We may add the LienClaim
's identity information to delete-liens
but it will tend to be long and difficult to much.
Instead, we might just add LienClaim
's uid. However, just adding uid, it is difficult for the owner of the resource to find from the uid. Also, I'm not sure if object uid can be regarded as unique. Creating an object with the name should ensure it.
Adding additional object types adds more places to have races, I'm against it unless you can convince me it's absolutely necessary.
Agree. If above concern can be solved by the other way, I agree to remove cluster-scoped Lien
object.
You want to ACL liens? I don't see a need to enforce this any more than we do for finalizers. And I would not solve that problem by adding Lien or LienClaim objects. I don't think a solution here should require any additional object types.
If only talking about the use case for secret protection, the feature needed is like block deleting "Secret A" that can be used by "Pod B" and "PersistentVolume C" while the secret is used. So, something like below will be enough.
apiVersion: v1
kind: Secret
metadata:
namespace: ns2
name: A
delete-liens:
- "ID to show that Pod B is using"
- "ID to show that PersistentVolume C is using"
type: Opaque
data:
And to generalize it, I was thinking about a way to manage such ID for each reason to avoid conflict in deleting liens by lien system's side. However, we can leave it to consumers.
And also thinking about using a lien per controller, not per reason, like below.
apiVersion: v1
kind: Secret
metadata:
namespace: ns2
name: A
delete-liens:
- "k8s.io/secret-protection-controller"
type: Opaque
data:
Then, I start to think about which will be easier "implementing own admission hook to block deletion" or "implementing controller to add lien"?
(Obviously, for users who would like to add such protection manually, lien systems is useful as this shows, but for controllers, it might not.)
There is no need for a lien controller. We would add code to kube-apiserver (reject deletion if there are liens, don't permit new liens if deletion timestamp is set).
We might want slightly more structure than raw text.
You can prototype today with e.g. specially formatted annotations and a webhook admission controller.
Sorry, I misread :) A lien per controller or per reason doesn't matter from the perspective of the lien mechanism, it's up to controller authors how they want to use the mechanism.
@lavalamp
You can prototype today with e.g. specially formatted annotations and a webhook admission controller.
I've implemented a prototype of lien as this. It is separated to 4 commits, but only 2 commits, the first commit and the last commit, will be the points of interest.
I will update the KEP based on this prototype.
Note that I've tested below way and work as shown below:
- Deploy:
ENABLE_ADMISSION_PLUGINS=Lien hack/local-up-cluster.sh
- Test:
- Without liens, deletion is not blocked
kubectl create secret generic test-secret --from-literal='username=my-app' --from-literal='password=39528$vdg7Jb'
kubectl get secret test-secret -o jsonpath='{.metadata.liens}{"\n"}'
kubectl delete secret test-secret
secret "test-secret" deleted
- With liens, deletion is blocked, and once all liens are removed, deletion is not blocked
kubectl create secret generic test-secret --from-literal='username=my-app' --from-literal='password=39528$vdg7Jb'
kubectl patch secret test-secret -p '{"metadata":{"liens":["never delete me"]}}' --type=merge
secret/test-secret patched
kubectl get secret test-secret -o jsonpath='{.metadata.liens}{"\n"}'
[never delete me]
kubectl delete secret test-secret
Error from server (Forbidden): secrets "test-secret" is forbidden: deletion not allowed by liens
kubectl patch secret test-secret -p '{"metadata":{"liens":[]}}' --type=merge
secret/test-secret patched
kubectl delete secret test-secret
secret "test-secret" deleted
@lavalamp
I've updated the KEP. PTAL
Also, I will upate the prototype of secret-protection to rely on this in-use protection mechanism to check the feasibility and also update the KEP for secret-protection.
This is close enough that maybe you can come to an API Machinery SIG meeting and pitch it to us? :)
@lavalamp
Thank you for your review and feedback. I will update the KEP.
This is close enough that maybe you can come to an API Machinery SIG meeting and pitch it to us? :)
Of course. Thank you for your invite. (It seems that the meeting is conflicting with that of SIG storage, however, this KEP will need to be merged before secret-protection KEP, so I will prioritize this KEP).
@lavalamp
Updated the KEP. PTAL (For rephrasing "validating admission webhook", used common phrase "validation in api-server".)
mentioned this KEP in #sig-cli xref https://github.com/kubernetes/enhancements/pull/2777
@detiber @neolit123
Thank you for your review. I've addressed the review comments. PTAL
Also, I changed the format for Liens field. Please also check it (In my understanding, content of a map needs to have a unique identifier, therefore I used LienID
as the identifier. However, it makes almost same as just a map of list of string for "Per Object" case). Any ideas to improve it?
@johnbelamaric
PRR approval deadline is approaching. Could you apporve it? If there are anything missing for PRR approval, please point out.
@wojtek-t @thockin cc: @lavalamp @detiber @neolit123 @msau42
Updated the KEP to change back to slice-of-strings Liens and restrict it to per-controller/per-user, to be more aligned with Finalizers. PTAL
I think it's important to work out what kind of permissions should be required in order to add lien. The KEP as currently written allows a namespace admin or editor to create an object that cannot be deleted. The impact here is greater than finalizers because a finalizer doesn't block the deletion itself. Once the deletionTimestamp is set, a finalizer can be removed and cannot be added back. This gives a non-racy path to having an object removed even after finalizers are set.
In contrast, liens would prevent deletion, so a cluster-admin would be racing to update the object to remove the lien and delete the object before a namespace editor is able to place the lien back.
One of our regrets with finalizers is the lack of separation between managing finalizers and updating the intent of the resources themselves. I think we should avoid making that mistake again and create a separate ACL model (possibly a subresource) to modify liens. I do not think that a namespace admin or editor should be able to create an object that a cluster-admin or namespace lifecycle controller cannot delete.
@deads2k do you think a subresource (or every resource!!) is a requirement for alpha? Is there (or should there be) a way to add a universal subresource?
@deads2k
Could you comment on whether making lien a subresource will be a requirement for alpha, as asked here?
I've added it as a "Beta graduation criteria" with describing the risk to the KEP. Can we go ahead with it? Actually, I don't have enough knowledge on whether we can change it after alpha without a big side effect, so I will wait for your decision.
@deads2k do you think a subresource (or every resource!!) is a requirement for alpha? Is there (or should there be) a way to add a universal subresource?
+1 I think we agree that the mitigation for the risk of "undelatable resources" is a dedicate ACL (subresource). So given we know the solution here, that doesn't sound like an Alpha-blocker for me (it clearly is Beta blocker).
Or do you think think that "generic subresource for even resource" is the tricky part here?
@deads2k - I'm personally supportive for this going alpha in 1.23 with the current risk. So if you believe that the race problem is solvable, I think we should go with it (unless you're not confident that we can reasonable solve the dedicate ACL one).
@deads2k do you think a subresource (or every resource!!) is a requirement for alpha? Is there (or should there be) a way to add a universal subresource?
I do think that we need to resolve permissions situations before creating the alpha level of any feature. In addition, this particular one is build on ObjectMeta is effectively an unversioned API that spans all resources. I think this raises the stakes for getting the initial implementation correct.
The closest current parallel is finalizers and we have never been able to (and probably never will be able) to go back and correct those short-comings.
@deads2k
Thank you for your comment. I understand that making lien field a subresource needs to happen in the first implementation.
Then, the next question is whether we can go ahead by just modifying this KEP to make the lien field a subresource to allow defining a separate RBAC in alpha. Or do we need anything else?
I'm not sure that making a field in metadata a subresource is possible and reasonable, nor making slice-of-string lien a subresource alone can really mitigate the risk. I would appreciate it if you could let me know the direction (I agree to skip this release if there is a blocker that won't be solved in time).
I think that permissions issues could be resolved by a subresource with separate permissions segregated from admin,editor.
I think the guarantees around "blocks deletion" still present a problem overall and may require a way to say, "ignore liens and force deletion" to allow, "no matter what someone else says, this needs to be removed" similar to how deletion+removing finalizers does today.
A design update containing those changes seems significant enough that it's unlikely to complete review today.
@deads2k
A design update containing those changes seems significant enough that it's unlikely to complete review today.
Thank you for your comment. Then, we need to skip this release (I've updated the KEP to just use subresource, but it seems that it won't be enough).
Let's continue discussion to merge it for k8s 1.24, later. Thank you for taking your time to review this KEP, everyone involved. I really appreciate it.
Let's continue discussion to merge it for k8s 1.24, later. Thank you for taking your time to review this KEP, everyone involved. I really appreciate it.
Let's just keep traction on it - we should continue discussion in the following days, not get back to it right before the next feature freeze.
Sorry we couldn't wrap this one up this time around.
I think there's some basic tension around whether this should be an advisory feature, to help prevent accidental deletion, or a hard locking feature which sets privileges around who can or cannot lock / delete.
I'm rooting for an advisory role for liens, which would help a longstanding feature request to make it hard to delete something accidentally. This would imply that anyone can apply a lien, and anyone can ignore them (you just have to kubectl --force --with-crowbar delete
-- since we already used plain old --force
for something else).
OTOH if you want a runtime referential integrity feature, then we have to permission the addition and removal of liens to make sure the wrong parties can't break the referential integrity. This sounds very complicated to me and I don't really want to do it if we don't have to.
Thank you for sharing the alternative idea. Advisory feature or hard locking feature seems a very good summary of the choices.
I think that it will be also a valid direction to mitigate the risk. If we decide to choose this direction, it can be added even after alpha without API changes, so it won't be an alpha blocker. I will update the KEP and wait for others to agree.
(Actually, my original intention is to prevent accidental deletion, therefore this change is not just to make this KEP merged in time, but it fits my use case).