Multistage deployments with ArgoCD-Trigger / -Health on decentralised and air-gapped cluster installations
Checklist
- [x] I've searched the issue queue to verify this is not a duplicate feature request.
- [ ] I've pasted the output of kargo version, if applicable.
- [ ] I've pasted logs, if applicable.
Dear community, for us Kargo is a perfect match because we implemented the concepts of staging & promotion via Argo Events / Argo Workflows ourselves, but that solution ended up being opaque and not flexible enough for our developers.
As a highly regulated enterprise we run ephemeral, air-gapped clusters where each cluster has its own ArgoCD instance that just pulls the relevant manifests and stage configurations from GitHub Enterprise. Stages are therefore distributed across clusters that are not aware of each other. The whole orchestration is done via GitOps and the pipelines mentioned above.
Proposed Feature
With Kargo we would gain full transparency if we could model promotion end-to-end, but here is the issue: a central Kargo instance cannot access the ArgoCD health of other clusters and cannot trigger syncs there.
Motivation
For design and compliance reasons we run all our clusters decentralised, fully via GitOps. To gain the currently missing end-to-end transparency of CD promotion, including the great ArgoCD integration features, we would need access to the remote ArgoCD app status.
Suggested Implementation
For this to work we would need a controller on each cluster that is able to communicate with the central Kargo installation. We would then need to be able to configure a new stage.promotionMechanism, e.g. argoCDRemoteAppUpdate, where we would configure the app details and additionally the remote controller endpoint.
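Purely as an illustration of what we imagine (argoCDRemoteAppUpdates and remoteControllerEndpoint are made-up names for this sketch, not existing Kargo fields):

promotionMechanisms:
  argoCDRemoteAppUpdates:        # hypothetical field, does not exist in Kargo today
  - appName: kargo-demo-uat
    appNamespace: argocd
    remoteControllerEndpoint: https://kargo-agent.uat-cluster.example.com   # hypothetical endpoint of the per-cluster controller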
The model you've described is almost exactly what we already follow -- controllers may be distributed and communicate with a "nearby" Argo CD control plane, but a centralized Kargo control plane. It's all "phone home," and never the other way around.
There is, however, no need to configure stages to know where the relevant Argo CD is. Stages can be labeled as belonging to a "shard" and they will be reconciled only by the corresponding Kargo Controller, which already knows how to talk to its "nearby" Argo CD control plane.
In short, you'll have multiple Kargo controllers, each of which is in communication with the Kargo control plane and an Argo CD control plane.
@krancour I am thrilled to read that. After reading your comment I had another look at the Helm charts and found the api.argocd.urls and controller.shardName parameters. Is this what we need to configure? Which label do we need to put on the stage to make this work? If this really is implemented already, I would like to contribute an "advanced deployment tutorial". Guys, you should show what this product is capable of! WOW! 💪
@MarkusNeuron thank you for the kind words. I missed one caveat... that's how we built it, but no one has tested this topology extensively yet.
Which label do we need to put on the stage to make this working?
kargo.akuity.io/shard: <shard name>
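For instance, on a Stage manifest this is just a metadata label (a minimal sketch; all other fields omitted, and the shard name must match the controller.shardName of the controller that should reconcile it):

apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
  name: uat
  namespace: kargo-demo
  labels:
    kargo.akuity.io/shard: distributed   # <shard name>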
Also note:
- Promotions will automatically be assigned to the same shard as whatever Stage they reference.
- Freight is never sharded because it needs to be visible to all controllers.
- You will either need to run a centralized controller to handle Warehouses OR put a shard label on your Warehouses if their subscriptions will only work from behind your firewall.
Will test this setup in the next few days and report back. Thx again!
Hi @krancour
I tested the sharded topology and wrote the following guide. Please review it and let me know if something needs to be corrected in case my understanding/assumptions were off. If you think I should move this guide to a GitHub Discussion, let me know; it might be useful for other users who want to experiment with the feature.
I aligned with @MarkusNeuron and we have the following follow-up questions:
- Is configuring k8s clients (kubeconfigs), and therefore exposing the kube-api-server of the cluster where the central ("management") Kargo is running, the only way to make the sharded topology model work?
- Would it be an option to instead expose just the kargo-controller endpoint of the central ("management") controller, which would then query/create/update the local kube-api-server and communicate the Kargo-related custom resources to the distributed controllers, rather than exposing the whole kube-api-server of the central cluster? IMHO this would fit an air-gapped cluster concept better.
Kargo Sharded Topology Guide
Set up environments
Prerequisites:
- Rancher Desktop
- kind
- kubectx (brew install kubectx)
- Docker registry proxy - in this guide we will refer to it as docker.example.com
Create two new kind clusters:
- central-mgmt (will be Kargo control plane)
- distributed (will "phone-home" to central-mgmt)
kind create cluster \
--wait 120s \
--config - <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: central-mgmt
nodes:
- extraPortMappings:
- containerPort: 31443 # Argo CD dashboard
hostPort: 31443
- containerPort: 31444 # Kargo dashboard
hostPort: 31444
- containerPort: 30081 # test application instance
hostPort: 30081
- containerPort: 30082 # UAT application instance
hostPort: 30082
- containerPort: 30083 # prod application instance
hostPort: 30083
EOF
kind create cluster \
--wait 120s \
--config - <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: distributed
nodes:
- extraPortMappings:
- containerPort: 31445 # Argo CD dashboard
hostPort: 31445
- containerPort: 31446 # Kargo dashboard
hostPort: 31446
- containerPort: 30181 # test application instance
hostPort: 30181
- containerPort: 30182 # UAT application instance
hostPort: 30182
- containerPort: 30183 # prod application instance
hostPort: 30183
EOF
Once the clusters are ready, you can switch context between them using kubectx.
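An optional sanity check: list the clusters and contexts, then switch between them.

kind get clusters          # should list central-mgmt and distributed
kubectx                    # lists available contexts, e.g. kind-central-mgmt and kind-distributed
kubectx kind-central-mgmt  # switch to the central-mgmt cluster
kubectx -                  # switch back to the previous context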
Deploy Helm charts
cert-manager
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
helm install cert-manager cert-manager --repo https://charts.jetstack.io --version 1.11.5 --namespace cert-manager --create-namespace --set installCRDs=true --set image.repository=docker.example.com/jetstack/cert-manager-controller --set cainjector.image.repository=docker.example.com/jetstack/cert-manager-cainjector --set webhook.image.repository=docker.example.com/jetstack/cert-manager-webhook --set startupapicheck.image.repository=docker.example.com/jetstack/cert-manager-ctl --wait
Change context to distributed cluster:
kubectx kind-distributed
Repeat the previous Helm command to install cert-manager on distributed cluster as well.
ArgoCD
Install the chart first on the central-mgmt cluster, using NodePort 31443.
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
helm upgrade --install argocd argo-cd --repo https://argoproj.github.io/argo-helm --version 5.51.6 --namespace argocd --create-namespace --set 'configs.secret.argocdServerAdminPassword=$2a$10$5vm8wXaSdbuff0m9l21JdevzXBzJFPCi8sy6OOnpZMAG.fOXL7jvO' --set dex.enabled=false --set notifications.enabled=false --set server.service.type=NodePort --set server.service.nodePortHttp=31443 --set server.extensions.enabled=true --set 'server.extensions.contents[0].name=argo-rollouts' --set 'server.extensions.contents[0].url=https://github.com/argoproj-labs/rollout-extension/releases/download/v0.3.3/extension.tar' --set global.image.repository=docker.example.com/argoproj/argocd --set redis.image.repository=docker.example.com/docker/library/redis --set server.extensions.image.repository=docker.example.com/argoproj-labs/argocd-extensions --wait
Secondly, install the chart on the distributed cluster, using NodePort 31445.
Change context to distributed cluster:
kubectx kind-distributed
helm upgrade --install argocd argo-cd --repo https://argoproj.github.io/argo-helm --version 5.51.6 --namespace argocd --create-namespace --set 'configs.secret.argocdServerAdminPassword=$2a$10$5vm8wXaSdbuff0m9l21JdevzXBzJFPCi8sy6OOnpZMAG.fOXL7jvO' --set dex.enabled=false --set notifications.enabled=false --set server.service.type=NodePort --set server.service.nodePortHttp=31445 --set server.extensions.enabled=true --set 'server.extensions.contents[0].name=argo-rollouts' --set 'server.extensions.contents[0].url=https://github.com/argoproj-labs/rollout-extension/releases/download/v0.3.3/extension.tar' --set global.image.repository=docker.example.com/argoproj/argocd --set redis.image.repository=docker.example.com/docker/library/redis --set server.extensions.image.repository=docker.example.com/argoproj-labs/argocd-extensions --wait
Argo Rollouts
Install Argo Rollouts on both clusters.
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
helm upgrade --install argo-rollouts argo-rollouts --repo https://argoproj.github.io/argo-helm --version 2.33.0 --create-namespace --namespace argo-rollouts --set controller.image.registry=docker.example.com --set controller.image.repository=argoproj/argo-rollouts --wait
Change context to distributed cluster:
kubectx kind-distributed
Repeat the previous Helm command to install Argo Rollouts on distributed cluster as well.
Kargo
central-mgmt
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
Set api.service.nodePort=31444
Set controller.shardName=central-mgmt
helm upgrade --install kargo oci://ghcr.io/akuity/kargo-charts/kargo --namespace kargo --create-namespace --set api.service.type=NodePort --set api.service.nodePort=31444 --set api.adminAccount.password=admin --set api.adminAccount.tokenSigningKey=iwishtowashmyirishwristwatch --set image.repository=docker.example.com/akuity/kargo --set controller.shardName=central-mgmt --wait
distributed
Change context to distributed cluster:
kubectx kind-distributed
Set api.service.nodePort=31446
Set controller.shardName=distributed
Set the api.argocd.urls mapping to point to https://argocd-server.argocd.svc - this is the ArgoCD running next to Kargo on the distributed cluster
Prepare the kubeconfig which Kargo will use to connect to central-mgmt cluster:
1. Copy ~/.kube/config to ~/kubeconfig.yaml:
   cp ~/.kube/config ~/kubeconfig.yaml
2. Edit ~/kubeconfig.yaml and keep only the entries relevant to the central-mgmt cluster. Make sure current-context is set to kind-central-mgmt. It should look similar to this:
   apiVersion: v1
   clusters:
   - cluster:
       certificate-authority-data: ...
       server: https://127.0.0.1:53113
     name: kind-central-mgmt
   contexts:
   - context:
       cluster: kind-central-mgmt
       user: kind-central-mgmt
     name: kind-central-mgmt
   current-context: kind-central-mgmt
   kind: Config
   preferences: {}
   users:
   - name: kind-central-mgmt
     user:
       client-certificate-data: ...
       client-key-data: ...
3. Get the IP address of the container that runs the central-mgmt cluster with the following command:
   docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' central-mgmt-control-plane
4. In ~/kubeconfig.yaml, set that IP address under the server key. This key is nested under cluster, which is in turn nested under the - cluster: item in the clusters list. Very likely it currently has the value https://127.0.0.1:<someport>. Change it to https://<ip_address_from_step_3>:6443, for example: https://172.18.0.2:6443
5. Create the kargo namespace:
   kubectl create namespace kargo
6. Create a secret from the kubeconfig with the following command:
   kubectl create secret generic central-mgmt-kubeconfig --from-file=kubeconfig.yaml -n kargo
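Optionally, verify that the secret landed in the kargo namespace before continuing:

kubectl get secret central-mgmt-kubeconfig -n kargo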
Once the secret is created, prepare values.yaml for the Helm chart installation with a file editor, e.g. with vim values.yaml
values.yaml should contain:
api:
service:
type: NodePort
nodePort: 31446
adminAccount:
password: admin
tokenSigningKey: iwishtowashmyirishwristwatch
argocd:
urls:
"distributed": https://argocd-server.argocd.svc
image:
repository: docker.example.com/akuity/kargo
kubeconfigSecrets:
kargo: central-mgmt-kubeconfig
controller:
shardName: distributed
Finally, deploy the Helm chart:
helm upgrade --install kargo oci://ghcr.io/akuity/kargo-charts/kargo --namespace kargo --create-namespace -f values.yaml --wait
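It may be worth confirming on each cluster that the Kargo components came up before continuing:

kubectl get pods -n kargo    # run on both clusters; all pods should reach Running/Completed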
Create ArgoCD applications
central-mgmt
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
cat <<EOF | kubectl apply -f -
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: kargo-demo
namespace: argocd
spec:
generators:
- list:
elements:
- stage: test
template:
metadata:
name: kargo-demo-{{stage}}
annotations:
kargo.akuity.io/authorized-stage: kargo-demo:{{stage}}
spec:
project: default
source:
repoURL: ${GITOPS_REPO_URL}
targetRevision: stage/{{stage}}
path: stages/{{stage}}
destination:
server: https://kubernetes.default.svc
namespace: kargo-demo-{{stage}}
syncPolicy:
syncOptions:
- CreateNamespace=true
EOF
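The ApplicationSet should generate a kargo-demo-test Application on this cluster; a quick way to confirm (the Application may report errors until the stage/test branch exists in the GitOps repo):

kubectl get applications -n argocd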
distributed
Change context to distributed cluster:
kubectx kind-distributed
cat <<EOF | kubectl apply -f -
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: kargo-demo
namespace: argocd
spec:
generators:
- list:
elements:
- stage: uat
- stage: prod
template:
metadata:
name: kargo-demo-{{stage}}
annotations:
kargo.akuity.io/authorized-stage: kargo-demo:{{stage}}
spec:
project: default
source:
repoURL: ${GITOPS_REPO_URL}
targetRevision: stage/{{stage}}
path: stages/{{stage}}
destination:
server: https://kubernetes.default.svc
namespace: kargo-demo-{{stage}}
syncPolicy:
syncOptions:
- CreateNamespace=true
EOF
Deploy modified Kargo Quickstart resources
We are going to re-use the Kargo resources from the Kargo Quickstart guide; the only adaptations we have to make are the following:
- set the kargo.akuity.io/shard label on Warehouse and Stage resources
- change images to point to our internal Artifactory repository docker.example.com
We model the test stage as running on the central-mgmt cluster, and the uat and prod stages as running on the distributed cluster.
- Do the steps as per the Kargo Quickstart guide (the Create a GitOps Repository section only) and fork the kargo-demo repo. Also make sure that your GITOPS_REPO_URL variable is set.
- In your fork, edit base/deploy.yaml (line 17) and change the image value from nginx:placeholder to docker.example.com/nginx/nginx:placeholder.
- Save your GitHub handle and your personal access token in environment variables:
  export GITHUB_USERNAME=<your github handle>
  export GITHUB_PAT=<your personal access token>
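If GITOPS_REPO_URL is not already set from the Quickstart steps, it just needs to point at your fork; assuming the fork keeps the default repository name, it would look something like:

export GITOPS_REPO_URL=https://github.com/${GITHUB_USERNAME}/kargo-demo.git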
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
Run the following command:
cat <<EOF | kubectl apply -f -
apiVersion: kargo.akuity.io/v1alpha1
kind: Project
metadata:
name: kargo-demo
---
apiVersion: v1
kind: Secret
type: Opaque
metadata:
name: kargo-demo-repo
namespace: kargo-demo
labels:
kargo.akuity.io/secret-type: repository
stringData:
type: git
url: ${GITOPS_REPO_URL}
username: ${GITHUB_USERNAME}
password: ${GITHUB_PAT}
---
apiVersion: kargo.akuity.io/v1alpha1
kind: Warehouse
metadata:
name: kargo-demo
namespace: kargo-demo
labels:
kargo.akuity.io/shard: central-mgmt
spec:
subscriptions:
- image:
repoURL: docker.example.com/nginx/nginx
semverConstraint: ^1.25.0
---
apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
name: test
namespace: kargo-demo
labels:
kargo.akuity.io/shard: central-mgmt
spec:
subscriptions:
warehouse: kargo-demo
promotionMechanisms:
gitRepoUpdates:
- repoURL: ${GITOPS_REPO_URL}
writeBranch: stage/test
kustomize:
images:
- image: docker.example.com/nginx/nginx
path: stages/test
argoCDAppUpdates:
- appName: kargo-demo-test
appNamespace: argocd
---
apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
name: uat
namespace: kargo-demo
labels:
kargo.akuity.io/shard: distributed
spec:
subscriptions:
upstreamStages:
- name: test
promotionMechanisms:
gitRepoUpdates:
- repoURL: ${GITOPS_REPO_URL}
writeBranch: stage/uat
kustomize:
images:
- image: docker.example.com/nginx/nginx
path: stages/uat
argoCDAppUpdates:
- appName: kargo-demo-uat
appNamespace: argocd
---
apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
name: prod
namespace: kargo-demo
labels:
kargo.akuity.io/shard: distributed
spec:
subscriptions:
upstreamStages:
- name: uat
promotionMechanisms:
gitRepoUpdates:
- repoURL: ${GITOPS_REPO_URL}
writeBranch: stage/prod
kustomize:
images:
- image: docker.example.com/nginx/nginx
path: stages/prod
argoCDAppUpdates:
- appName: kargo-demo-prod
appNamespace: argocd
EOF
Verify the hypothesis (a few helper commands follow below this list):
- the test stage is assigned to the central-mgmt cluster
- the uat and prod stages are assigned to the distributed cluster
- when a promotion to the uat or prod stage succeeds, the Kargo controller running on the distributed cluster is triggered, the ArgoCD running on the distributed cluster performs the application sync, and the application ends up synced on that cluster
- the UIs of both the central-mgmt cluster and the distributed cluster show the same pipeline
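A couple of commands that should help with the checks above, run against the central-mgmt cluster (which hosts the Kargo control plane):

kubectl get stages -n kargo-demo --show-labels   # shard label per Stage
kubectl get promotions -n kargo-demo             # Promotions inherit the shard of the Stage they reference
kubectl get freight -n kargo-demo                # Freight is not sharded and stays visible to all controllers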
AnalysisTemplates
Change context to central-mgmt cluster:
kubectx kind-central-mgmt
Create AnalysisTemplate
cat <<EOF | kubectl apply -f -
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
name: kargo-demo-analysistemplate-uat
namespace: kargo-demo
spec:
metrics:
- name: fail-or-pass
#count: 1
#interval: 5s
#failureLimit: 1
provider:
job:
spec:
template:
spec:
containers:
- name: sleep
image: docker.example.com/alpine:latest
command: [sh, -c]
args:
- exit {{args.exit-code}}
restartPolicy: Never
backoffLimit: 1
EOF
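The template simply runs a Kubernetes Job that exits with whatever exit-code argument the Stage's verification config passes in, so 0 makes the verification succeed and any non-zero value makes it fail. A quick check that the template exists:

kubectl get analysistemplates -n kargo-demo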
Modify the uat stage, and add verification to the spec:
cat <<EOF | kubectl apply -f -
apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
name: uat
namespace: kargo-demo
labels:
kargo.akuity.io/shard: distributed
spec:
subscriptions:
upstreamStages:
- name: test
promotionMechanisms:
gitRepoUpdates:
- repoURL: ${GITOPS_REPO_URL}
writeBranch: stage/uat
kustomize:
images:
- image: docker.example.com/nginx/nginx
path: stages/uat
argoCDAppUpdates:
- appName: kargo-demo-uat
appNamespace: argocd
verification:
analysisTemplates:
- name: kargo-demo-analysistemplate-uat
analysisRunMetadata:
labels:
app: kargo-demo-analysistemplate-uat
annotations:
foo: bar
args:
- name: exit-code # no CamelCaseAllowed!
value: "0"
EOF
Modify the Warehouse and add a new image subscription. In my example this is docker2.example.com/some/new/dummy/repo/image with semverConstraint ^2024.0.0.
cat <<EOF | kubectl apply -f -
apiVersion: kargo.akuity.io/v1alpha1
kind: Warehouse
metadata:
name: kargo-demo
namespace: kargo-demo
labels:
kargo.akuity.io/shard: central-mgmt
spec:
subscriptions:
- image:
repoURL: docker.example.com/nginx/nginx
semverConstraint: ^1.25.0
- image:
repoURL: docker2.example.com/some/new/dummy/repo/image
semverConstraint: ^2024.0.0
EOF
Change context to distributed cluster:
kubectx kind-distributed
Create the namespace kargo-demo - this is needed because there is no kargo-demo namespace on the distributed cluster yet, and the AnalysisRun will be created in that namespace:
kubectl create namespace kargo-demo
Make sure that the new Freight appeared, then promote that new Freight first to the test stage and after that to the uat stage. In the uat stage, an AnalysisRun should be triggered.
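While the uat promotion runs, you can watch for the AnalysisRun on the distributed cluster (the current kubectx context):

kubectl get analysisruns -n kargo-demo -w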
Change back to the central-mgmt cluster and check the uat Stage's status:
kubectx kind-central-mgmt
kubectl get stage uat -n kargo-demo -o yaml
It should show something similar to this:
verificationInfo:
analysisRun:
namespace: kargo-demo
name: uat.01hrfdz695qqqecvrzh4csp7bm.2511465
phase: Successful
phase: Successful
The AnalysisRun resource was created and ran on the distributed cluster (the same cluster that the Stage's shard label designates).
This issue has been automatically marked as stale because it had no activity for 90 days. It will be closed if no activity occurs in the next 30 days but can be reopened if it becomes relevant again.
Closing this issue, but @WZHGAHO we will likely use elements of your guide in addressing #2447