gitops-operator icon indicating copy to clipboard operation
gitops-operator copied to clipboard

Fix bug where ArgoCD removes nodePlacement stanza from configuration

Open Rizwana777 opened this issue 1 year ago • 14 comments

What type of PR is this? /kind bug

JIRA - https://issues.redhat.com/browse/GITOPS-4185

We were not able to patch the NodePlacement directly through ArgoCD CR, if we do, it removes the nodePlacement from ArgoCD CR while reconciling because of this https://github.com/redhat-developer/gitops-operator/blob/master/controllers/gitopsservice_controller.go#L475 here defaultArgoCDInstance.Spec.NodePlacement is nil as it not getting updated anywhere else in code, defaultArgoCDInstance.Spec.NodePlacement is only getting updated if we update it through GitOpsService CR or else it update existingArgoCD.Spec.NodePlacement as nil because of the reason I have mentioned above , but if a user wants to add custom nodePlacement he/she should patch nodePlacement through GitOpsService CR and it is working as expected I see no bug in code and code is valid https://github.com/redhat-developer/gitops-operator/blob/01251d23020555e3a28d4a69e30718e9d5f0eb4d/controllers/gitopsservice_controller.go#L392

I have updated the code here - https://github.com/redhat-developer/gitops-operator/blob/01251d23020555e3a28d4a69e30718e9d5f0eb4d/controllers/gitopsservice_controller.go#L475 which is checking for nil , the logic mentioned in the PR states that , it will check for defaultArgoCDInstance.Spec.NodePlacement not equal to nil (it means nodePlacePlacement is updated through GitOps service CR and the values got updated here in if condition) else it will not enter the if condition to check if defaultArgoCDInstance.Spec.NodePlacement is nil (it means nodePlacePlacement is updated through ArgoCD by user and it should not replace the existingArgoCD with nil value, that is the changes made by user will not get reverted back as mentioned in the bug when it reconciles)

This is what I have came up with for the bug mentioned above, please do let me know your suggestions, thank you

Rizwana777 avatar Jul 03 '24 11:07 Rizwana777

/retest

Rizwana777 avatar Jul 04 '24 06:07 Rizwana777

/retest

jgwest avatar Jul 07 '24 00:07 jgwest

Also, did you verify why the users are unable to patch the resource via the GitopService CR? why does that not add the required nodeSelectors, maybe there is another bug in that.

saumeya avatar Jul 08 '24 09:07 saumeya

Also, did you verify why the users are unable to patch the resource via the GitopService CR? why does that not add the required nodeSelectors, maybe there is another bug in that.

Yes I tried reproducing both the bugs , I reproduced one of the bug where user is trying to patch using argocd (oc patch argocd) and the second one((oc patch gitopsservice) I did not see this bug, I tried every steps to reproduce it, but not seeing this bug and I am able to patch argocd via GitopsService CR

Rizwana777 avatar Jul 08 '24 10:07 Rizwana777

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Once this PR has been reviewed and has the lgtm label, please assign svghadi for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci[bot] avatar Jul 18 '24 07:07 openshift-ci[bot]

/retest

Rizwana777 avatar Jul 18 '24 12:07 Rizwana777

/retest

Rizwana777 avatar Jul 25 '24 08:07 Rizwana777

LGTM, rebasing to latest to see if that fixes the test failures

jgwest avatar Jul 25 '24 22:07 jgwest

/retest

jgwest avatar Aug 01 '24 13:08 jgwest

/test v4.12-kuttl-sequential

Rizwana777 avatar Aug 01 '24 15:08 Rizwana777

/retest

Rizwana777 avatar Aug 01 '24 15:08 Rizwana777

CI issue?

/retest

jgwest avatar Aug 02 '24 06:08 jgwest

Hi @jgwest, there are four sequential test failures that are recurring for this PR so I don't think re trigger would help. Retesting without proper investigation leads to more resource consumption and at the moment we are trying to minimise our cloud cost. I have suggested @Rizwana777 to take a look at the tests that are failing and run them individually with her change to debug further, maybe some tests need updates?

1-071_validate_node_selectors 1-028_validate_run_on_infra 1-020_validate_redis_ha_nonha (might not be related but worth checking since it fails for this PR only) 1-035_validate_argocd_secret_repopulate (might not be related but worth checking since it fails for this PR only)

Sequential test history: https://prow.ci.openshift.org/job-history/gs/test-platform-results/pr-logs/directory/pull-ci-redhat-developer-gitops-operator-master-v4.12-kuttl-sequential https://prow.ci.openshift.org/job-history/gs/test-platform-results/pr-logs/directory/pull-ci-redhat-developer-gitops-operator-master-v4.13-kuttl-sequential https://prow.ci.openshift.org/job-history/gs/test-platform-results/pr-logs/directory/pull-ci-redhat-developer-gitops-operator-master-v4.14-kuttl-sequential

varshab1210 avatar Aug 02 '24 07:08 varshab1210