provider-gcp
NodePool infinitely reconciles
What happened?
Hello!
I have been introducing Crossplane at my company over the last few weeks. It works great, but one thing I noticed is an issue with the NodePool CRD.
I created a Cluster resource and a NodePool resource simultaneously. This then took a while, which is totally normal. After a few minutes the Cluster resource reached the ready state and stayed there, so everything was fine with it.
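For context, "ready" here refers to the Crossplane readiness condition, which can be checked with kubectl (a minimal sketch; the resource name matches the manifests below):

# The READY and SYNCED columns reflect the managed resource's conditions.
kubectl get cluster.container.gcp.crossplane.io test-cluster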
Then the NodePool resource started to create the node pool inside GCP. But as soon as it was created, the provider immediately wanted to update it again. This caused it to issue ~3000 UpdateNodePool API requests to GCP within 24 hours, and it would not stop doing that, even after waiting for a long time. I was wondering where this behaviour comes from, but I could not find another issue targeting this problem.
This also had an impact on cluster availability: the cluster could never operate properly because it was permanently in an updating state due to the NodePool constantly changing something.
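As an illustration, the stuck state is also visible on the GCP side; a minimal sketch assuming the cluster name and zone from the manifests below:

# While the node pool is being updated over and over, the cluster status
# stays RECONCILING instead of settling on RUNNING.
gcloud container clusters describe test-cluster \
  --zone europe-west3-a --format='value(status)'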
I expected the resource to stay quiet after creating the node pool and to only issue UpdateNodePool requests if something diverges from its spec.
How can we reproduce it?
I used the following manifests for the Cluster and NodePool:
apiVersion: container.gcp.crossplane.io/v1beta2
kind: Cluster
metadata:
  name: test-cluster
spec:
  deletionPolicy: Delete
  forProvider:
    location: europe-west3-a
    network: default
    maintenancePolicy:
      window:
        dailyMaintenanceWindow:
          # this is GMT, which means the actual window starts at 23:00 local time
          startTime: "21:00"  # quoted so YAML does not parse it as a sexagesimal number
    verticalPodAutoscaling:
      enabled: true
    addonsConfig:
      gcePersistentDiskCsiDriverConfig:
        enabled: true
    initialClusterVersion: latest
    releaseChannel:
      channel: STABLE
---
apiVersion: container.gcp.crossplane.io/v1beta1
kind: NodePool
metadata:
  name: test-cluster-pool
spec:
  deletionPolicy: Orphan
  forProvider:
    autoscaling:
      minNodeCount: 3
      maxNodeCount: 5
      autoprovisioned: false
      enabled: true
    clusterRef:
      name: test-cluster
    config:
      machineType: e2-small
      diskSizeGb: 10
      imageType: cos_containerd
      oauthScopes:
        - "https://www.googleapis.com/auth/devstorage.read_only"
        - "https://www.googleapis.com/auth/compute"
    upgradeSettings:
      maxSurge: 1
      maxUnavailable: 1
    initialNodeCount: 3
    locations:
      - europe-west3-a
You can then see the NodePool resource starting to issue update requests. I monitored that behaviour by looking at the audit logs of the service account in use.
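For reference, a sketch of how these calls can be counted in Cloud Audit Logs; the service account e-mail is a placeholder for the one provider-gcp is configured with:

# Count UpdateNodePool calls from the provider's service account in the last 24h.
gcloud logging read \
  'protoPayload.methodName="google.container.v1.ClusterManager.UpdateNodePool"
   protoPayload.authenticationInfo.principalEmail="crossplane@my-project.iam.gserviceaccount.com"' \
  --freshness=24h --format='value(timestamp)' | wc -l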
What environment did it happen in?
Crossplane version: v0.21.0
Thank you for your help, and please let me know if I need to provide any more insights into the issue!
Is there a fix for this?
Unfortunately not. As a workaround, I made sure to change my deletion policy to Orphan and deleted the NodePool resource after everything was created. This obviously defeats the purpose of managing it as a CRD, so I would be happy to see this issue resolved somehow. I also didn't try again with a newer version of Crossplane, so the described behaviour might still occur.
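For anyone else running into this, a rough sketch of that workaround (resource name taken from the manifests above; adjust to your setup):

# Make sure deleting the managed resource orphans the node pool in GCP.
kubectl patch nodepool.container.gcp.crossplane.io test-cluster-pool \
  --type=merge -p '{"spec":{"deletionPolicy":"Orphan"}}'
# Removing the managed resource stops the endless UpdateNodePool calls;
# the node pool itself keeps running in GCP.
kubectl delete nodepool.container.gcp.crossplane.io test-cluster-pool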