crossplane-runtime
AddFinalizer clears Status subobject leading to delayed object readiness
What happened?
We create managed resources using only the Observe management policy. The provider initially sets the Synced condition to True but does not set the Ready condition at all.
We can see in the provider logs that the object seems to be successfully reconciled and is requeued according to the configured sync interval.
Once the sync interval is hit, the resource becomes ready.
How can we reproduce it?
- Create any MR in the cluster with only the Observe management policy. We can easily reproduce this with provider-upjet-azure on any resource, for instance VirtualNetwork with the following spec:
apiVersion: network.azure.upbound.io/v1beta2
kind: VirtualNetwork
metadata:
  annotations:
    crossplane.io/external-name: vnet-test-crossplane
  name: vnet-test-crossplane
spec:
  deletionPolicy: Delete
  forProvider:
    resourceGroupName: rg-test-crossplane
  initProvider: {}
  managementPolicies:
  - Observe
  providerConfigRef:
    name: default
- You will see the object become Synced almost immediately, but the Ready condition will stay nil. Also, the whole atProvider struct stays nil. Provider logs will show the following:
2024-12-06T20:57:33Z DEBUG provider-azure Reconciling {"controller": "managed/network.azure.upbound.io/v1beta1, kind=virtualnetwork", "request": {"name":"vnet-test-crossplane"}}
2024-12-06T20:57:33Z DEBUG provider-azure Connecting to the service provider {"uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "name": "vnet-test-crossplane", "gvk": "network.azure.upbound.io/v1beta1, Kind=VirtualNetwork"}
2024-12-06T20:57:33Z DEBUG provider-azure Instance state not found in cache, reconstructing... {"uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "name": "vnet-test-crossplane", "gvk": "network.azure.upbound.io/v1beta1, Kind=VirtualNetwork"}
2024-12-06T20:57:33Z DEBUG provider-azure Observing the external resource {"uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "name": "vnet-test-crossplane", "gvk": "network.azure.upbound.io/v1beta1, Kind=VirtualNetwork"}
2024-12-06T20:57:34Z DEBUG provider-azure Diff detected {"uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "name": "vnet-test-crossplane", "gvk": "network.azure.upbound.io/v1beta1, Kind=VirtualNetwork", "instanceDiff": "*terraform.InstanceDiff{mu:sync.Mutex{state:0, sema:0x0}, Attributes:map[string]*terraform.ResourceAttrDiff{\"location\":*terraform.ResourceAttrDiff{Old:\"canadacentral\", New:\"\", NewComputed:false, NewRemoved:true, NewExtra:interface {}(nil), RequiresNew:true, Sensitive:false, Type:0x0}}, Destroy:false, DestroyDeposed:false, DestroyTainted:false, RawConfig:cty.NilVal, RawState:cty.NilVal, RawPlan:cty.NilVal, Meta:map[string]interface {}(nil)}"}
2024-12-06T20:57:34Z DEBUG provider-azure Skipping update due to managementPolicies. Reconciliation succeeded {"controller": "managed/network.azure.upbound.io/v1beta1, kind=virtualnetwork", "request": {"name":"vnet-test-crossplane"}, "uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "version": "1176637016", "external-name": "vnet-test-crossplane", "requeue-after": "2024-12-07T02:49:57Z"}
- You can either wait for the requeue to trigger or you can restart the provider.
- After that, the Ready condition will be set to true.
What environment did it happen in?
Crossplane version: 1.18.0
Additional information
After step-by-step debugging, we have found that the bug seems to come from the AddFinalizer function in the APIFinalizer struct. This struct is used by provider-upjet-azure and probably all upjet-based providers (although I didn't validate).
The AddFinalizer function calls the kube client's Update function, which writes the object to the kube API and then decodes the object returned by the API back into the object passed by the caller. This, in turn, overwrites any in-memory field that Update does not persist, such as the Status sub-struct, so status set earlier in the reconcile is lost.
Here are the relevant parts of the code:
- AddFinalizer function
- This ends up calling Into from the kube client
- Reconciler's call to AddFinalizer
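
The mechanism described above can be sketched in a self-contained way. This is only an illustration under stated assumptions, not the crossplane-runtime code: Resource, Status, fakeAPI, addFinalizer, and the finalizer name are hypothetical stand-ins, and the sketch assumes that status is set in memory before the finalizer is added and that Update is skipped when the finalizer is already present.

```go
package main

import "fmt"

// Status and Resource are hypothetical, stripped-down stand-ins for a managed
// resource; they are not the crossplane-runtime types.
type Status struct {
	Conditions []string
}

type Resource struct {
	Finalizers []string
	Status     Status
}

// fakeAPI mimics an API server for a kind with the status subresource enabled.
type fakeAPI struct {
	stored Resource
}

// Update persists everything except Status, then overwrites the caller's
// object with the stored copy, as a client that decodes the response into the
// same pointer would.
func (a *fakeAPI) Update(obj *Resource) {
	a.stored.Finalizers = append([]string(nil), obj.Finalizers...)
	*obj = Resource{
		Finalizers: append([]string(nil), a.stored.Finalizers...),
		Status:     Status{Conditions: append([]string(nil), a.stored.Status.Conditions...)},
	}
}

// addFinalizer is a simplified analogue of AddFinalizer: it only calls Update
// when the finalizer is missing (an assumption of this sketch).
func addFinalizer(a *fakeAPI, obj *Resource, name string) {
	for _, f := range obj.Finalizers {
		if f == name {
			return
		}
	}
	obj.Finalizers = append(obj.Finalizers, name)
	a.Update(obj)
}

// conditionsAfterReconcile sets a condition in memory (as an observation
// would), adds the finalizer, and returns the conditions still in memory.
func conditionsAfterReconcile(a *fakeAPI, obj *Resource) []string {
	obj.Status.Conditions = []string{"Ready"}
	addFinalizer(a, obj, "example.org/finalizer") // hypothetical finalizer name
	return obj.Status.Conditions
}

func main() {
	api := &fakeAPI{}
	mr := &Resource{}
	fmt.Println("first reconcile: ", conditionsAfterReconcile(api, mr))
	fmt.Println("second reconcile:", conditionsAfterReconcile(api, mr))
}
```

In this sketch the first reconcile ends with an empty condition list because Update replaced the in-memory object, while the second reconcile keeps the Ready condition because the finalizer already exists and Update is never called. That matches the observed behaviour of the resource only becoming ready after the requeue or a provider restart.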