provider-gcp

Replica Names list for primary instance doesn't populate correctly

Open BigSammich opened this issue 3 years ago • 3 comments

What happened?

Scenario 1: CloudSQLInstance resources that have more than one replica created don't have their ReplicaNames list populated correctly.

The image below shows the CloudSQL instance response from the GCP API: [image]

The image below shows the CloudSQLInstance resource that provider-gcp is managing: [image]

Expected behavior: CloudSQLInstance resource should accurately reflect the list of associated replicas (in this case there should be two replicas)

Scenario 2: CloudSQLInstance resource still has an assigned replica even after the replica was deleted or promoted.

Expected behavior: CloudSQLInstance resource should accurately reflect the list of associated replicas (in this case there should be no replicas)

Additional impact: This causes the CloudSQL instance to be in a constant update state because of the replica mismatch.

How can we reproduce it?

Scenario 1:

  1. create a CloudSQLInstance resource
  2. create two replicas of the above primary CloudSQLInstance resource
  3. observe that the ReplicaNames for the CloudSQLInstance resource only contains the first replica that was created
  4. also observe that provider-gcp continually tries to update the primary instance to match the (incorrect) replica list

Scenario 2:

  1. create a CloudSQLInstance resource
  2. create one replica of the above primary CloudSQLInstance resource
  3. promote or delete the replica
  4. observe that the ReplicaNames for the CloudSQLInstance resource still contains the replica even though the GCP API will report that the replica list is empty
  5. also observe that provider-gcp continually tries to update the primary instance to match the (incorrect) replica list

What environment did it happen in?

Crossplane version: 1.3.1-up.1
provider-gcp: 0.18

BigSammich • Oct 08 '21 21:10

@BigSammich In step 2 of scenario 1, and step 3 of scenario 2, how are you creating/promoting replicas? If you're doing so outside of Crossplane the behaviour you're seeing is expected; the spec of the Crossplane CloudSQLInstance is considered to be 'authoritative' and won't be updated to reflect changes made outside of Crossplane - instead Crossplane will try to undo those changes.

negz • Nov 30 '21 23:11

@negz replica CREATION (scenario 1/2, step 2) was done natively by applying a replica claim and having Crossplane (provider-gcp) create the instance.

Replica PROMOTION (scenario 2, step 3) was done outside of Crossplane (via the GCP console), because there is no replica promotion functionality within provider-gcp that I'm aware of. I guess this brings up my (incorrect) assumption that if functionality doesn't exist within Crossplane (provider-gcp), then the operator would not be authoritative for that area/field/use case. Please let me know if I'm incorrect and replica promotion can be done within Crossplane (provider-gcp).

Replica DELETION (scenario 2, step 3) was done natively by deleting the replica claim.

BigSammich • Dec 05 '21 14:12

I have worked with those scenarios a bit.

  • Since those resources are created by Crossplane, the spec is the source of truth, so, as @negz mentioned, the controller will try to update the instance. But the GCP API doesn't accept the update, so every reconciliation loop finds the resource not up to date (uptodate: false) and calls Update() again.
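To illustrate the loop described above, here is a toy Go model, not provider-gcp code. The `fakeAPI` type, its `Update` method, and the replica names are all hypothetical. It assumes the API keeps its own replica list and silently disregards the list sent in an update, which is one plausible reading of "doesn't accept it".

```go
package main

import (
	"fmt"
	"reflect"
)

// fakeAPI is a hypothetical stand-in for the Cloud SQL API. Assumption: the
// server tracks replicas itself and disregards any replica list in an update.
type fakeAPI struct {
	replicaNames []string
}

// Update accepts the call but leaves the server-side replica list untouched.
func (a *fakeAPI) Update(desired []string) {}

// upToDate reports whether the desired and observed replica lists match.
func upToDate(desired, observed []string) bool {
	return reflect.DeepEqual(desired, observed)
}

// reconcile runs n reconciliation loops and returns how many called Update.
func reconcile(api *fakeAPI, desired []string, n int) int {
	updates := 0
	for i := 0; i < n; i++ {
		if upToDate(desired, api.replicaNames) {
			continue
		}
		api.Update(desired)
		updates++
	}
	return updates
}

func main() {
	// The API knows two replicas; the spec only lists the first (Scenario 1).
	api := &fakeAPI{replicaNames: []string{"replica-1", "replica-2"}}
	desired := []string{"replica-1"}
	fmt.Println(reconcile(api, desired, 5)) // every loop calls Update
}
```

Because the update never changes what the API reports, the comparison fails on every pass and the instance never leaves the update state.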

Therefore, for Scenario 1, @BigSammich, you should definitely adjust your spec; it isn't safe to rely on the controller to handle that.

(But I'm guessing we could ignore replicaNames in the go-cmp diff based on some condition.)

Something like this, as a new field?

spec:
  ignoreDiff:
  - spec.forProvider.replicaNames

The same applies to Scenario 2.
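A minimal Go sketch of how such an opt-out could affect the up-to-date check. This is not provider-gcp code: the `params` struct, the `ignoreReplicaNames` flag, and the values are hypothetical. It uses only the standard library, so the field is cleared on both copies before comparing. With go-cmp, `cmpopts.IgnoreFields` could serve the same purpose.

```go
package main

import (
	"fmt"
	"reflect"
)

// params is a hypothetical, trimmed-down stand-in for spec.forProvider.
type params struct {
	Tier         string
	ReplicaNames []string
}

// upToDate compares desired and observed parameters. Both arguments are
// passed by value, so clearing ReplicaNames here does not touch the caller's
// structs. When ignoreReplicaNames is set, the replica list is dropped from
// both sides before the comparison.
func upToDate(desired, observed params, ignoreReplicaNames bool) bool {
	if ignoreReplicaNames {
		desired.ReplicaNames = nil
		observed.ReplicaNames = nil
	}
	return reflect.DeepEqual(desired, observed)
}

func main() {
	// Scenario 2: the spec still lists a replica that the API no longer reports.
	desired := params{Tier: "example-tier", ReplicaNames: []string{"replica-1"}}
	observed := params{Tier: "example-tier"}

	fmt.Println(upToDate(desired, observed, false)) // false: replica lists differ
	fmt.Println(upToDate(desired, observed, true))  // true: replica list ignored
}
```

Other fields are still compared, so a real difference such as a changed tier would still trigger an update.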

mcbenjemaa • Jan 12 '22 17:01