crossplane-runtime

AddFinalizer clears Status subobject leading to delayed object readiness

Open gravufo opened this issue 10 months ago • 10 comments

What happened?

We create managed resources using only the Observe management policy. The provider initially sets the Synced condition to True but does not set the Ready condition at all. The provider logs show that the object is reconciled successfully and requeued according to the configured sync interval. Only once the sync interval elapses does the resource become ready.

How can we reproduce it?

  1. Create any MR in the cluster with only the Observe management policy. This is easy to reproduce with provider-upjet-azure on any resource, for instance a VirtualNetwork with the following spec:
apiVersion: network.azure.upbound.io/v1beta2
kind: VirtualNetwork
metadata:
  annotations:
    crossplane.io/external-name: vnet-test-crossplane
  name: vnet-test-crossplane
spec:
  deletionPolicy: Delete
  forProvider:
    resourceGroupName: rg-test-crossplane
  initProvider: {}
  managementPolicies:
  - Observe
  providerConfigRef:
    name: default
  2. The object becomes Synced almost immediately, but the Ready condition stays unset (nil), and the whole atProvider struct stays nil as well. Provider logs show the following:
2024-12-06T20:57:33Z    DEBUG    provider-azure    Reconciling    {"controller": "managed/network.azure.upbound.io/v1beta1, kind=virtualnetwork", "request": {"name":"vnet-test-crossplane"}}
2024-12-06T20:57:33Z    DEBUG    provider-azure    Connecting to the service provider    {"uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "name": "vnet-test-crossplane", "gvk": "network.azure.upbound.io/v1beta1, Kind=VirtualNetwork"}
2024-12-06T20:57:33Z    DEBUG    provider-azure    Instance state not found in cache, reconstructing...    {"uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "name": "vnet-test-crossplane", "gvk": "network.azure.upbound.io/v1beta1, Kind=VirtualNetwork"}
2024-12-06T20:57:33Z    DEBUG    provider-azure    Observing the external resource    {"uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "name": "vnet-test-crossplane", "gvk": "network.azure.upbound.io/v1beta1, Kind=VirtualNetwork"}
2024-12-06T20:57:34Z    DEBUG    provider-azure    Diff detected    {"uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "name": "vnet-test-crossplane", "gvk": "network.azure.upbound.io/v1beta1, Kind=VirtualNetwork", "instanceDiff": "*terraform.InstanceDiff{mu:sync.Mutex{state:0, sema:0x0}, Attributes:map[string]*terraform.ResourceAttrDiff{\"location\":*terraform.ResourceAttrDiff{Old:\"canadacentral\", New:\"\", NewComputed:false, NewRemoved:true, NewExtra:interface {}(nil), RequiresNew:true, Sensitive:false, Type:0x0}}, Destroy:false, DestroyDeposed:false, DestroyTainted:false, RawConfig:cty.NilVal, RawState:cty.NilVal, RawPlan:cty.NilVal, Meta:map[string]interface {}(nil)}"}
2024-12-06T20:57:34Z    DEBUG    provider-azure    Skipping update due to managementPolicies. Reconciliation succeeded    {"controller": "managed/network.azure.upbound.io/v1beta1, kind=virtualnetwork", "request": {"name":"vnet-test-crossplane"}, "uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "version": "1176637016", "external-name": "vnet-test-crossplane", "requeue-after": "2024-12-07T02:49:57Z"} 
  3. Either wait for the requeue to trigger or restart the provider.
  4. After that, the Ready condition is set to True.
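
To confirm the symptom from the steps above, one option is to inspect the resource's status conditions programmatically. The following is only a sketch: it assumes the resource has been dumped as JSON (for example via `kubectl get ... -o json`) and that conditions live under `status.conditions` with `type` and `status` fields. The helper name `conditionStatus` and the sample payload are hypothetical and not part of crossplane-runtime.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// condition mirrors the two fields of a status condition that matter here.
type condition struct {
	Type   string `json:"type"`
	Status string `json:"status"`
}

// resource captures only the part of a managed resource that we inspect.
type resource struct {
	Status struct {
		Conditions []condition `json:"conditions"`
	} `json:"status"`
}

// conditionStatus (hypothetical helper) returns the status of the named
// condition and whether the condition is present at all.
func conditionStatus(raw []byte, name string) (string, bool, error) {
	var r resource
	if err := json.Unmarshal(raw, &r); err != nil {
		return "", false, err
	}
	for _, c := range r.Status.Conditions {
		if c.Type == name {
			return c.Status, true, nil
		}
	}
	return "", false, nil
}

func main() {
	// Hypothetical payload resembling the state described in the steps:
	// Synced is True, Ready is missing entirely.
	raw := []byte(`{"status":{"conditions":[{"type":"Synced","status":"True"}]}}`)
	for _, name := range []string{"Synced", "Ready"} {
		s, ok, err := conditionStatus(raw, name)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: present=%t status=%q\n", name, ok, s)
	}
}
```

A resource hit by this issue would report Synced as present and Ready as absent until the next requeue or a provider restart.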

What environment did it happen in?

Crossplane version: 1.18.0

Additional information

After step-by-step debugging, we found that the bug seems to come from the AddFinalizer function of the APIFinalizer struct. This struct is used by provider-upjet-azure and probably by all upjet-based providers (although I didn't validate that). AddFinalizer calls the kube client's Update function, which writes the object to the kube API and also overwrites the caller's in-memory object with the copy returned by the API. This, in turn, resets any part of the object that Update does not persist, such as the Status sub-struct, discarding whatever had been set on it in memory.
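
The mechanism described above can be illustrated with a toy model. This is not the crossplane-runtime code: `object`, `fakeAPI` and `addFinalizer` are hypothetical stand-ins, and the sketch assumes that the status was populated in memory before the finalizer is added and that the finalizer is only written when it is not already present.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a managed resource.
type object struct {
	Finalizers []string
	Ready      *bool // stands in for the Status sub-struct; nil means "not set"
}

// fakeAPI imitates an API server for a kind whose status is not written
// by a plain Update.
type fakeAPI struct {
	stored object
}

// Update persists the finalizers, ignores the status, and then overwrites
// the caller's object with the server-side copy.
func (a *fakeAPI) Update(o *object) {
	a.stored.Finalizers = append([]string(nil), o.Finalizers...)
	// o.Ready is deliberately not persisted.
	*o = object{
		Finalizers: append([]string(nil), a.stored.Finalizers...),
		Ready:      a.stored.Ready,
	}
}

// addFinalizer (hypothetical simplification) adds the finalizer and calls
// Update only if the finalizer is missing.
func addFinalizer(a *fakeAPI, o *object, name string) {
	for _, f := range o.Finalizers {
		if f == name {
			return
		}
	}
	o.Finalizers = append(o.Finalizers, name)
	a.Update(o)
}

func main() {
	api := &fakeAPI{}
	mr := &object{}
	ready := true

	// First reconcile: status is set in memory, then the finalizer is added.
	mr.Ready = &ready
	addFinalizer(api, mr, "finalizer.example.org")
	fmt.Println("first reconcile, status kept:", mr.Ready != nil)

	// Second reconcile: the finalizer already exists, so Update is skipped.
	mr.Ready = &ready
	addFinalizer(api, mr, "finalizer.example.org")
	fmt.Println("second reconcile, status kept:", mr.Ready != nil)
}
```

In this model the in-memory status is lost on the first pass and survives on the second, which would match a resource that only becomes Ready after the next sync interval or a provider restart.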

Here are the relevant parts of the code:

gravufo, Dec 06 '24 21:12