oam-kubernetes-runtime
Trait apply should use update instead of patch
Problem Statement
Currently, when oam-runtime applies traits, it does a MergePatch (code link1 and link2). Basically, it takes the trait to apply (A) and the existing trait object in the cache (B), and computes the difference between A and B to create a patch to send to the APIServer.
This leads to the following problem:
- Trait initialized as:
```yaml
auto: false
rules: [a, b, c]
```
- Trait updated as:
```yaml
auto: true
```
In the above scenario, we expect the `rules` field to be removed, but after applying, the trait actually becomes:
```yaml
auto: true
rules: [a, b, c]
```
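For illustration, here is a minimal Go sketch of RFC 7386 merge-patch semantics using the github.com/evanphx/json-patch library (this is not the actual oam-runtime code): a merge patch only touches the fields it mentions, and it removes a field only when the patch carries an explicit null.

```go
package main

import (
	"fmt"

	jsonpatch "github.com/evanphx/json-patch"
)

func main() {
	// B: the trait object that currently exists in the cluster.
	current := []byte(`{"auto": false, "rules": ["a", "b", "c"]}`)
	// A: the newly rendered trait, where `rules` has been dropped.
	desired := []byte(`{"auto": true}`)

	// Sending the desired state itself as a merge patch leaves every field
	// it does not mention untouched, so the stale `rules` survives.
	merged, err := jsonpatch.MergePatch(current, desired)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(merged)) // {"auto":true,"rules":["a","b","c"]}

	// A merge patch removes a field only via an explicit null.
	patch, err := jsonpatch.CreateMergePatch(current, desired)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch)) // {"auto":true,"rules":null}
}
```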
Proposed Solution
To solve the problem, we should simply update the spec of the trait instead of patching it. As for field-safety concerns: traits are operational data that will only be managed by the OAM platform, so updating is fine in this case. Workloads, on the other hand, can be changed by end users, so we need to think more carefully about how to update them.
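For illustration, an update-style apply boils down to the get-then-create-or-update pattern sketched below (a rough sketch against a recent controller-runtime client, not the exact code of any existing applicator):

```go
package apply

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// updateApply creates the trait if it does not exist, otherwise overwrites it
// wholesale with an Update, so fields dropped from the desired spec are
// actually removed (unlike a merge patch that simply never mentions them).
func updateApply(ctx context.Context, c client.Client, desired client.Object) error {
	current := desired.DeepCopyObject().(client.Object)
	if err := c.Get(ctx, client.ObjectKeyFromObject(desired), current); err != nil {
		if apierrors.IsNotFound(err) {
			return c.Create(ctx, desired)
		}
		return err
	}
	// Carry over the resourceVersion so the update passes the optimistic
	// concurrency check against the object we just read.
	desired.SetResourceVersion(current.GetResourceVersion())
	return c.Update(ctx, desired)
}
```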
Maybe I can help. Could you please assign this issue to me?
@captainroy-hy sure, thanks!
I wonder why patch can't do "remove"?
@ryanzhang-oss A patch will not remove fields that it does not mention.
Use APIUpdatingApplicator instead of APIPatchingApplicator: https://github.com/crossplane/crossplane-runtime/blob/v0.10.0/pkg/resource/api.go#L178-L208
Bump crossplane-runtime from 0.9.0 to 0.10.0.
Thanks for your kind suggestion!
ref https://github.com/oam-dev/kubevela/issues/876
Just to be clear, the real problem is not only to remove the fields that are not present in the new CR, but also to preserve the fields that are not controlled by the trait.
If this is the real goal, then I don't think anyone can solve this unless the trait declares what fields it owns.
@ryanzhang-oss Right.
Let me add another example:
```yaml
# user applies
auto: false
rules: [a, b, c]

# controller adds `others`
auto: false
rules: [a, b, c]
others: "abc"

# user applies again
auto: true

# user wants to see
auto: true
others: "abc"
```
Per our discussion, this is a well-known two-way merge problem: https://medium.com/flant-com/3-way-merge-patches-helm-werf-beb7eccecdfe
The correct way to solve this problem is via a three-way merge, which is what kubectl apply does. Basically, we need to emulate kubectl apply in the Apply() function -- even though it is called Apply(), it does not behave like kubectl apply yet.
Nowadays k8s supports server-side apply, which drastically simplifies the client-side implementation. Here are some related materials I could find, though they only provide scattered examples (see the sketch after this list):
- https://github.com/kubernetes/kubernetes/issues/88091
- https://godoc.org/sigs.k8s.io/controller-runtime/pkg/client
- https://ymmt2005.hatenablog.com/entry/2020/04/14/An_example_of_using_dynamic_client_of_k8s.io/client-go#Using-the-dynamic-client-to-implement-SSA
- https://medium.com/swlh/break-down-kubernetes-server-side-apply-5d59f6a14e26
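To give a feel for how little client-side code server-side apply needs, here is a hedged sketch using controller-runtime's client (the field-manager name oam-runtime is just an example, not anything the codebase defines):

```go
package apply

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/client"
)

// serverSideApply sends the desired object as an apply patch. The API server
// merges it with fields owned by other managers, and removes fields this
// manager set previously but no longer sends (subject to the version caveats
// discussed below).
func serverSideApply(ctx context.Context, c client.Client, desired client.Object) error {
	return c.Patch(ctx, desired, client.Apply,
		client.FieldOwner("oam-runtime"), // example field-manager name for this controller
		client.ForceOwnership,            // take over conflicting fields instead of returning a conflict error
	)
}
```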
Server-side apply can solve this problem on k8s v1.19 and later. Before 1.19, it does not remove fields in the case shown here.
I don't get what this comment means. Could you elaborate in more detail and provide an example?
About the `auto`/`rules`/`others` case above:
Because the resource already exists, if the user wants `rules` removed via a patch, they have to send the following request:
```yaml
# user applies again
auto: true
rules: null
```
If we switch from patch to update:
```yaml
# user updates again
auto: true
# result
auto: true
```
That is the semantics of update.
Whether to update or patch depends on the business logic of the traits.
Is there no uniform standard?
Server-side apply this Deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
...
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
```
Then server-side apply the following with the same field manager:
```yaml
apiVersion: apps/v1
kind: Deployment
...
spec:
  # replicas: 3   <===== unset
  template:
    spec:
      containers:
      - name: nginx
        # image: nginx:1.14.2   <===== unset
```
With api-server v1.19, we will get:
```yaml
apiVersion: apps/v1
kind: Deployment
...
spec:
  replicas: 1   # <===== reset to the default value
  template:
    spec:
      containers:
      - name: nginx
        # image: nginx:1.14.2   <===== removed because it has no default value
```
But with api-server v1.16-v1.18, we will get:
```yaml
apiVersion: apps/v1
kind: Deployment
...
spec:
  replicas: 3   # <===== unchanged
  template:
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2   # <===== unchanged
```
This is related to this fix, which was merged into k8s v1.19.
Is there no uniform standard?
@allenhaozi This is what a three-way merge patch handles, and it is how kubectl apply is implemented. We are going to align with upstream to fix the problem correctly.
@captainroy-hy I see. If this is a well-known upstream issue, we don't need to worry about it. All we need to do is implement kubectl apply on the client side. On the server side, it is up to users which versions (and thus which behaviors) they run.
I don't think any patch method can solve this problem without what @allenhaozi described. The fact that the trait owns `auto` and `rules` but not the `others` field has to be explicitly stated somewhere. This is not k8s-related.
@ryanzhang-oss That's a wrong assumption. I believe this is exactly what kubectl apply addresses.
Working on this PR: refactor the apply mechanism to compute a three-way diff on the client side, just like kubectl apply does under the hood; the last-applied state is stored in an annotation.
Some related issues w.r.t. the client-side implementation of kubectl apply:
- https://github.com/kubernetes/kubernetes/issues/15894
- https://github.com/kubernetes/kubernetes/issues/17333
- https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/555-server-side-apply/README.md#motivation
Note that these are issues for kubectl because it is driven by users directly; we can avoid and control that risk by isolating the mechanism from end users.
For now we can implement a three-way merge on the client side using our own annotation key, app.oam.dev/last-applied.
Once k8s 1.19 is more widely adopted, we can switch the implementation to server-side apply.
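A rough sketch of that client-side three-way merge, assuming the previous desired state is kept in the app.oam.dev/last-applied annotation mentioned above (the helper below is illustrative, not the actual PR code):

```go
package apply

import (
	"context"

	"k8s.io/apimachinery/pkg/types"
	"k8s.io/apimachinery/pkg/util/jsonmergepatch"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// threeWayApply computes a JSON merge patch from three states and sends it:
//   original - what we applied last time (read from the app.oam.dev/last-applied annotation)
//   modified - what we want to apply now
//   current  - what actually exists in the cluster
// Fields we set last time but no longer send become explicit nulls (and are
// removed), while fields added by other controllers or users are preserved.
func threeWayApply(ctx context.Context, c client.Client, obj client.Object, original, modified, current []byte) error {
	patch, err := jsonmergepatch.CreateThreeWayJSONMergePatch(original, modified, current)
	if err != nil {
		return err
	}
	return c.Patch(ctx, obj, client.RawPatch(types.MergePatchType, patch))
}
```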
I hit a problem where restarting oam-addon-sage causes Pods to restart.
The Pods belong to a Deployment that is controlled by a ServerWorkload.
oam-addon-sage contains ServerWorkload, TaskWorkload, ManualScaler, ...
https://github.com/kubernetes-sigs/controller-runtime/issues/1311
@allenhaozi Could you clarify the problem in more detail? Why is it related to this issue, or should we create a separate issue?
In simple words: two controllers control one resource, and with the current implementation mechanism it is difficult to patch that resource.
The workload renders the Deployment, and restarting the controllers leads to the resource's Pods being restarted.
For example:
ServerWorkload creates a Deployment and sets replicas to 0.
ManualScalerTrait patches the replicas of this Deployment from 0 to 3.
When the controllers restart, if the Deployment already exists, we see the following events:
```
Events:
  Type    Reason             Age                  From                   Message
  ----    ------             ----                 ----                   -------
  Normal  ScalingReplicaSet  13s (x12 over 5h5m)  deployment-controller  Scaled down replica set simple-web-33-5d759bd4cf to 0
  Normal  ScalingReplicaSet  12s (x15 over 5h7m)  deployment-controller  Scaled up replica set simple-web-33-5d759bd4cf to 3
```
It's a question of how a Workload and a Trait can work together to update a resource when we cannot simply patch it, so I'm mentioning it here.
@allenhaozi A possible workaround: the ServerWorkload controller could express no opinion on .spec.replicas when creating/reconciling the Deployment, leaving .spec.replicas to the ManualScalerTrait, just like the ContainerizedWorkload controller does.
But of course, "field management by multiple controllers" is still a problem to deal with, as @ryanzhang-oss said in this issue.
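A hedged sketch of that workaround (ServerWorkload is the reporter's own workload type, so this is purely illustrative): render the Deployment with .spec.replicas left unset, so the scaler trait alone manages the replica count.

```go
package workload

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// renderDeployment renders the Deployment for a workload without setting
// .spec.replicas. Leaving Replicas nil means this controller expresses no
// opinion on the replica count, so a ManualScalerTrait (or a user) can manage
// it without the two controllers fighting over it on every reconcile/restart.
func renderDeployment(name, namespace, image string, labels map[string]string) *appsv1.Deployment {
	return &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace},
		Spec: appsv1.DeploymentSpec{
			// Replicas is intentionally left nil.
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{Name: name, Image: image}},
				},
			},
		},
	}
}
```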
Fixed in https://github.com/oam-dev/kubevela/pull/857. Any plan to port it to this repo too?
Hello! How can I help? I am new here but very curious to learn and explore this. Can anyone guide me on where I should start?
Hi @Ananya-1106, welcome! This repo is more like a library of the OAM implementation; we suggest you get started with KubeVela, which contains the oam-kubernetes-runtime lib.