karmada icon indicating copy to clipboard operation
karmada copied to clipboard

feature: moving Karmada resource key information from labels to annotations

Open jwcesign opened this issue 1 year ago • 10 comments

What would you like to be added: Now, Karmada puts the resource owner information into labels, which could cause the label value longer than 63 chars.

So first, let's figure out what's the current situation:

PropagationPolicy max_len(name) = 63 max_len(namespace)=63

Deployment max_len(name)=253 max_len(namespace)=63

After find the matched PP, there will be two labels:

propagationpolicy.karmada.io/name: PropagationPolicy name
propagationpolicy.karmada.io/namespace: PropagationPolicy namespace

ResourceBinding max_len(name)=Deployment's name+kind > 253 (1) max_len(namespace)=Deployment's namespace < 63

After matched, there will be two labels:

propagationpolicy.karmada.io/name: PropagationPolicy name
propagationpolicy.karmada.io/namespace: PropagationPolicy namesapce

Work max_len(name)=Deplyment's name + hash(namepsace-name-kind) > 253 (2) max(namespace)=Deployment's namespace < 63

There will be one label:

resourcebinding.karmada.io/key: hash(binding namspace/name)

Deployment in member clusters max_len(name)=253 max_len(namesapce)=63

There will be following labels:

karmada.io/managed: "true"
propagationpolicy.karmada.io/name: PropagationPolicy name
propagationpolicy.karmada.io/namespace: PropagationPolicy namespace
resourcebinding.karmada.io/key: hash(binding namspace/name)
work.karmada.io/name: Deployment name+hash(namespace-name-kind) (3)
work.karmada.io/namespace: work's ns

(1) It couldn't be longer than 253, there will be errors. (2) It couldn't be longer than 253, there will be errors. (3) It couldn't be longer than 63, there will be errors.

For (1)/(2), there are few possibilities of users using excessively long deployment names. We should provide documentation that indicates the maximum length for workload names.

To address (3), the solution is to transfer all relevant owner information to annotations and include a unique identifier label there(uid), which will be used for label selection during coding, like following:

resourcebinding.karmada.io/uid: 93162d3c-ee8e-4995-9034-05f4d5d2c2b9

The changes roadmap:

  • v1.7.0: Add uid label into the resource to show the owner information, and add owner information into the annotations.
  • v1.8.0: Do not add the owner information label to the resource labels.

Why is this needed:

If users create a workload with a name longer than 63 characters, the workload may not be synchronized to member clusters because of the webhook validation.

/cc @RainbowMango

jwcesign avatar Aug 25 '23 03:08 jwcesign

/assign

jwcesign avatar Aug 25 '23 06:08 jwcesign

should have a document for existing resources after #3899 is merged. How to handle existing resources that have many errors in the controller manager

E0911 05:43:18.420445       1 execution_controller.go:121] Failed to sync work(xxxx) to cluster(test-cluster): resource(kind=Deployment, xxxxx) already exist in cluster test-cluster and the work.karmada.io/conflict-resolution strategy value is empty, karmada will not manage this resource

wu0407 avatar Sep 11 '23 06:09 wu0407

Due to the issue mentioned in https://github.com/karmada-io/karmada/pull/3899#issuecomment-1721047093, in the context of backup and recovery, it cannot be guaranteed that the UID remains consistent during recovery.

Therefore, we need a unique resource ID to indicate resource dependencies, which will be used as a label selector.

Currently, two approaches are being explored:

  1. Generating a resource ID that is consistent with the replicaSet name. In Kubernetes, to avoid replicaSet naming conflicts, Deployment records a collision index. Each time it's created, a name is generated by hashing the pod template and the index, and then the create API is called. If the creation fails, the index is incremented, and a new attempt is made. As a result, replicaSet uniqueness is guaranteed by the API server to avoid hash collisions.

    In Karmada, PP(When it's deleted, the rb will be deleted by label selector with PP resource id) does not create any resources, and it cannot invoke Kubernetes' create API. Furthermore, PPs might be parsed concurrently, making it impossible to guarantee the uniqueness of PP resource ID.

  2. Using a method consistent with Kubernetes to generate a resource ID. Kubernetes generates UID using https://github.com/google/uuid, directly creating UIDs without checking if they exist. The probability of collisions is extremely low (creating trillions of UIDs within a year, and only one would be a duplicate).

https://github.com/google/uuid/blob/47f5b3936c94efb365bdfc62716912ed9e66326f/version4.go#L25C1-L38C35

So, we can directly call this library to generate resource IDs.

@4125 @chaunceyjiang , what do you think?

jwcesign avatar Oct 13 '23 02:10 jwcesign

@whitewindmills too

jwcesign avatar Oct 13 '23 02:10 jwcesign

Using a method consistent with Kubernetes to generate a resource ID. Kubernetes generates UID using https://github.com/google/uuid, directly creating UIDs without checking if they exist. The probability of collisions is extremely low (creating trillions of UIDs within a year, and only one would be a duplicate).

+1. I think this method is meaningful.

However, we have introduced a feature at https://github.com/karmada-io/karmada/pull/4007, so do we need to deprecate this feature?

chaunceyjiang avatar Oct 13 '23 05:10 chaunceyjiang

However, we have introduced a feature at https://github.com/karmada-io/karmada/pull/4007, so do we need to deprecate this feature?

I think we need to do it

jwcesign avatar Oct 13 '23 06:10 jwcesign

@wu0407 @whitewindmills any suggestions?

jwcesign avatar Oct 16 '23 11:10 jwcesign

After talking about this topic in the community meeting. Here is the conclusion:

Overview: image

  1. When Deployment matches one PP, check whether PP has a label called propagationpolicy.karmada.io/id.
    • If it has, copy the label propagationpolicy.karmada.io/id and its value to deployment labels.
    • If it hasn't, create a new id with https://github.com/google/uuid and add the label(propagationpolicy.karmada.io/id: {id_created}) to PP and deployment.
  2. When creating RB(triggered by step 1), create a new ID with https://github.com/google/uuid and add the label(resourcebinding.karmada.io/id: {id_created}) to RB, add copy label propagationpolicy.karmada.io/id and it's values to RB too.
  3. When creating work(triggered by step2), create a new ID with https://github.com/google/uuid and add the label(work.karmada.io/id) to work, and copy the label resourcebinding.karmada.io/id and its value to it.
  4. When syncing deployment to member clusters(triggered by step3), copy the related work's label work.karmada.io/id and its value to it.

So after the ID is added, we can use it as a label selector to find and operate the resources.

To implement this fully, we need to do different things in different releases.

Release v1.8.0

  1. Add new resource ID labels in different resources(PR: https://github.com/karmada-io/karmada/pull/4199).
  2. Delete old uid label(PR: https://github.com/karmada-io/karmada/pull/4199)
  3. Add a new resource ID to the webhook in order to decrease the time required for reconciliation, referring https://github.com/karmada-io/karmada/pull/4199#discussion_r1398731479.(PR: TBD)

Release v1.9.0

  1. Change the logic of finding resources(using the new id label), like finding related works when deleting RB.

Note We should give a notification about the changes: users who use the old label as a label selector, should adapt to the changes.

jwcesign avatar Oct 24 '23 09:10 jwcesign

related doc: https://docs.google.com/document/d/1b-5EHUAQf8wIoCnerKTHiyiSIzsypqCNz0CWUzvUhnI/edit#heading=h.6kowitu3c2hd

jwcesign avatar Mar 27 '24 02:03 jwcesign

/assign

XiShanYongYe-Chang avatar Mar 30 '24 08:03 XiShanYongYe-Chang

We have solved it, please refer to #4711 /close

XiShanYongYe-Chang avatar May 22 '24 09:05 XiShanYongYe-Chang

@XiShanYongYe-Chang: Closing this issue.

In response to this:

We have solved it, please refer to #4711 /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

karmada-bot avatar May 22 '24 09:05 karmada-bot