argo-cd icon indicating copy to clipboard operation
argo-cd copied to clipboard

feat: Application dependencies

Open jannfis opened this issue 1 year ago • 24 comments

Closes https://github.com/argoproj/argo-cd/issues/7437

There's some docs within this PR which describe the functionality and configuration.

Checklist:

  • [ ] Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
  • [ ] The title of the PR states what changed and the related issues number (used for the release note).
  • [ ] The title of the PR conforms to the Toolchain Guide
  • [ ] I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
  • [ ] I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
  • [ ] Does this PR require documentation updates?
  • [ ] I've updated documentation as required by this PR.
  • [ ] Optional. My organization is added to USERS.md.
  • [ ] I have signed off all my commits as required by DCO
  • [ ] I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • [ ] My build is green (troubleshooting builds).
  • [ ] My new feature complies with the feature status guidelines.
  • [ ] I have added a brief description of why this PR is necessary and/or what this PR solves.

jannfis avatar Aug 29 '23 17:08 jannfis

Codecov Report

Attention: Patch coverage is 60.00000% with 118 lines in your changes are missing coverage. Please review.

Project coverage is 50.07%. Comparing base (f87897c) to head (7513255). Report is 100 commits behind head on master.

:exclamation: Current head 7513255 differs from pull request most recent head b388917. Consider uploading reports for the commit b388917 to get more accurate results

Files Patch % Lines
controller/appcontroller.go 36.36% 52 Missing and 4 partials :warning:
controller/sync.go 76.78% 20 Missing and 6 partials :warning:
controller/dependencies.go 70.96% 12 Missing and 6 partials :warning:
pkg/apis/application/v1alpha1/types.go 41.93% 17 Missing and 1 partial :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #15280      +/-   ##
==========================================
+ Coverage   49.73%   50.07%   +0.34%     
==========================================
  Files         274      267       -7     
  Lines       48948    46256    -2692     
==========================================
- Hits        24343    23165    -1178     
+ Misses      22230    20825    -1405     
+ Partials     2375     2266     -109     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Aug 29 '23 20:08 codecov[bot]

Thanks for working on this! Is there a public test image I can use to try it out?

LS80 avatar Sep 20 '23 11:09 LS80

Is there a public test image I can use to try it out?

I'm afraid there's not, sorry. We usually don't build images for PRs at this stage. But you can easily build your own, by checking out this PR and running make image (needs docker or podman with docker emulation to work)

jannfis avatar Sep 20 '23 16:09 jannfis

I've tried this and the functionality works great and as advertised, i did however noticed the following:

After i switched to the image and CRDs i built from this branch the CPU usage went up quite a bit for the application controller:

Screenshot from 2023-10-29 17-25-47 Each line represents half a cpu core. The first increase to 1.5 cores is when i switched from ArgoCD built from the main branch to this. Initially i had nothing deployed but after adding two applications the CPU usage went up quite a bit further, each application contained a single deployment resource.

There are no errors in the log for the controller or anything else that seems useful.

ellakk avatar Oct 29 '23 16:10 ellakk

Thanks, that's some valuable feedback @ellakk! I gotta dig into this, it's probably the dependency graph being build too often.

jannfis avatar Nov 01 '23 23:11 jannfis

Thank you for working on this!

@jannfis is this supposed to be working with applicationSets?

riuvshyn avatar Dec 11 '23 09:12 riuvshyn

@jannfis Would this work with apps of application-sets, the gitops-bridge and other end users use this pattern to bootstrap a new cluster with addons/plugins/operators in a specific order. Currently there is no status to indicate that the all child apps of an application set are healthy to move on to the next application set.

csantanapr avatar Feb 02 '24 05:02 csantanapr

@ellakk it would be helpful if there is some way to minimally reproduce your issue. Is there a way for you to supply us with an App of Apps or similar which reproduces the issue (suitably redacted if required)?

I'm really looking forward to this PR and would love to see this pushed over the finish line!

blakepettersson avatar Feb 12 '24 12:02 blakepettersson

@ellakk it would be helpful if there is some way to minimally reproduce your issue. Is there a way for you to supply us with an App of Apps or similar which reproduces the issue (suitably redacted if required)?

I'm really looking forward to this PR and would love to see this pushed over the finish line!

Hi, sorry for the late reply. As I wrote in my initial post, I had nothing deployed (and no gitrepos added either). All i did was build the containers from what is currently the HEAD of this branch, applied the CRDs and deployed the containers with the manifests from this branch. The CPU usage for the controller was about 40x of what i normally get from an empty ArgoCD deployment. The CPU usage then went up even further when i added two applications where one depended on the other, but the problem with the CPU was already apparent.

I can try to recreate the problem if the developer or other people cant. I no longer have the exact manifests I used but if I have to redeploy this build I will also post the manifests.

What i think would be more helpful is if someone who is also interested in this feature built and deployed ArgoCD from the head of this branch and shared their experience.

ellakk avatar Feb 28 '24 23:02 ellakk

Sorry for being so late to the recent comments.

@riuvshyn I haven't tested it with ApplicationSet, but ff you define one or more dependency selectors in an ApplicationSet's Application template, I do not see a reason why it would not work.

@csantanapr The code uses an Application's sync and health state to determine whether to proceed in the dependency chain. So it all depends on that information being exposed by your Application.

jannfis avatar Mar 08 '24 00:03 jannfis

Just built and deployed with the latest commit from this branch and the performance issues i had are gone (both on OpenShift and K3S). Thanks a lot for the work you are putting into this @jannfis!

ellakk avatar Mar 14 '24 16:03 ellakk

Thank you so much for testing again, @ellakk. Actually, I have not changed anything I thought was performance related, so potentially the iteration of the PR you tested was against a state of the master branch that had this problem.

Anyhow, thanks for confirming that you don't see that performance issue anymore. Please let me know if there's something else you notice during testing.

jannfis avatar Mar 14 '24 16:03 jannfis

Just wanted to say I’m so excited to see this merged soon. Should reduce my deployment for a new cluster from 10-15 steps to 3 or so! Thanks for all this awesome work @jannfis

mattbrandman avatar Mar 17 '24 17:03 mattbrandman

I think this is good to go for the first iteration. The only remaining question imho would be: Should this feature be behind a feature-flag or not?

cc @crenshaw-dev @alexmt @pasha-codefresh

jannfis avatar Mar 19 '24 14:03 jannfis

It's certainly non-trivial. But I'd want to play with it, and my default playground is https://cd.apps.argoproj.io/applications so if the feature flag wasn't enabled there, I wouldn't play with it ...

jsoref avatar Mar 19 '24 15:03 jsoref

@jsoref Thanks for that thought. But I was wondering, how would you play with that feature on a read-only public instance?

jannfis avatar Mar 19 '24 15:03 jannfis

I'd expect to be able to see all the screens and configs w/o making changes--Ideally it's enough to get a sense of the actual layouts and interactions...

jsoref avatar Mar 19 '24 16:03 jsoref

Should this feature be behind a feature-flag or not?

IMO it should not, but that's just me 🙏

blakepettersson avatar Mar 28 '24 12:03 blakepettersson

@CodiumAI-Agent /review

jannfis avatar Mar 30 '24 00:03 jannfis

PR Review

⏱️ Estimated effort to review [1-5]

5, because the PR introduces a significant new feature with a wide impact across multiple files, including changes to the core data models, addition of new functionalities, and updates to the documentation. The complexity and size of the changes require a thorough review to ensure quality and maintainability.

🧪 Relevant tests

Yes

🔍 Possible issues

Possible Bug: The implementation assumes that applications can only depend on other applications within the same namespace and project. This might not always be desirable or sufficient for all use cases.

Performance Concern: The recursive dependency resolution could potentially lead to performance issues for applications with a large number of dependencies or deeply nested dependency trees.

🔒 Security concerns

No

Code feedback:
relevant filecontroller/appcontroller.go
suggestion      

Consider implementing a mechanism to prevent infinite recursion or stack overflow in refreshDependencies. This could happen if there's a circular dependency between applications. A simple way to mitigate this is by keeping track of already visited applications during the recursion and stopping if a cycle is detected. [important]

relevant linefunc (ctrl *ApplicationController) refreshDependencies(app *v1alpha1.Application, state *v1alpha1.OperationState) {

relevant filecontroller/dependencies.go
suggestion      

To improve the efficiency of dependency resolution, consider caching the results of dependency lookups. This can significantly reduce the load on the Kubernetes API server in clusters with a large number of applications. [medium]

relevant linefunc (mgr *appStateManager) buildDependencyGraph(app *v1alpha1.Application, depGraph *topsort.Graph) error {

relevant filepkg/apis/application/v1alpha1/types.go
suggestion      

For better usability and future extensibility, consider adding validation for the ApplicationDependency struct. This could include checks like ensuring syncDelay and timeout are positive values and that selectors are not empty when blockOnEmpty is true. [medium]

relevant linetype ApplicationDependency struct {

relevant filedocs/user-guide/app_dependencies.md
suggestion      

Add a section on troubleshooting common issues with application dependencies, such as circular dependencies or misconfigured selectors. This can help users diagnose and resolve issues more quickly. [medium]

relevant line## Configuring Application dependencies


✨ Review tool usage guide:

Overview: The review tool scans the PR code changes, and generates a PR review which includes several types of feedbacks, such as possible PR issues, security threats and relevant test in the PR. More feedbacks can be added by configuring the tool.

The tool can be triggered automatically every time a new PR is opened, or can be invoked manually by commenting on any PR.

  • When commenting, to edit configurations related to the review tool (pr_reviewer section), use the following template:
/review --pr_reviewer.some_config1=... --pr_reviewer.some_config2=...
[pr_reviewer]
some_config1=...
some_config2=...

See the review usage page for a comprehensive guide on using this tool.

CodiumAI-Agent avatar Mar 30 '24 00:03 CodiumAI-Agent

Hi all curious if there is a general timeline on this being merged as it looks like its passing all the tests?

mattbrandman avatar Apr 12 '24 01:04 mattbrandman

This is an absolutely essential feature for my business as well. Without this spinning up new environments with ArgoCD is nearly impossible as it requires knowledge of which apps are dependent on which apps to apply Application(s) in that order and let them sync as you apply them.

yinzara avatar Apr 19 '24 16:04 yinzara

Added some thoughts.

I'm generally uncomfortable adding another ordering mechanism to Argo CD. I think that we risk moving folks away from "eventual consistency" models of deployment in k8s back to an imperative CI-type model, just built in Argo CD instead of in Jenkins. I think if we push forward with this feature, we should write documentation outlining use cases where this tool makes sense as well as a list of alternatives which could provide a better experience (controllers that enforce ordering, validating webhooks that block deployment until prerequisites are met, readiness checks which prevent apps from spinning up too soon, etc.).

It looks like you never have head a complex apps with dynamic previews. For now we have a previews via appsets and they includes a DB, redis, etc setup. Setting of different argocd.argoproj.io/sync-wave just not working at all - as for me its not track such value withing appsets. So only way - define dependencies.

simonoff avatar Apr 28 '24 17:04 simonoff

Added some thoughts.

I'm generally uncomfortable adding another ordering mechanism to Argo CD. I think that we risk moving folks away from "eventual consistency" models of deployment in k8s back to an imperative CI-type model, just built in Argo CD instead of in Jenkins. I think if we push forward with this feature, we should write documentation outlining use cases where this tool makes sense as well as a list of alternatives which could provide a better experience (controllers that enforce ordering, validating webhooks that block deployment until prerequisites are met, readiness checks which prevent apps from spinning up too soon, etc.).

I generally agree that eventual consistency should be the default way to go (and I'm sure @vfarcic would approve). I see this PR as a stepping stone to be able to enforce a specific order of deletion, which IMO is something necessary - perhaps we can get some inspiration from crossplane/crossplane#3393.

Otherwise I'm 💯 with what you're saying.

blakepettersson avatar Apr 29 '24 11:04 blakepettersson