argo-helm
[argo-cd] Switch to bitnami/redis and bitnami/redis-cluster chart
Is your feature request related to a problem?
I have some issues with the redis-ha chart. If some pods are destroyed, they don't synchronize back properly, and I have to delete all the pods and wait for all of them to become ready again.
Related helm chart
argo-cd
Describe the solution you'd like
I feel like this chart should use the Bitnami-maintained charts, which are now the "default" for a large part of the community.
See https://artifacthub.io/packages/helm/bitnami/redis https://artifacthub.io/packages/helm/bitnami/redis-cluster
Describe alternatives you've considered
No response
Additional context
No response
:+1: Actually, it does not work on OpenShift out of the box. You need to create RoleBindings and a ServiceAccount specific to Redis first.
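For reference, the pre-work on OpenShift amounts to something like the following. This is an illustrative sketch only: the resource names, the namespace, and the SCC ClusterRole are assumptions, not the chart's actual identifiers, and which SCC you need depends on the chart's securityContext settings.

```yaml
# Illustrative sketch - names here are hypothetical, not the chart's.
# On OpenShift, the Redis pods need a dedicated ServiceAccount plus a
# RoleBinding granting an SCC that permits the chart's UID/fsGroup settings.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: argocd-redis-ha      # hypothetical name
  namespace: argocd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: argocd-redis-ha-scc  # hypothetical name
  namespace: argocd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:openshift:scc:nonroot  # pick the SCC matching the chart's securityContext
subjects:
  - kind: ServiceAccount
    name: argocd-redis-ha
    namespace: argocd
```

You would then point the chart's Redis pods at that ServiceAccount via the chart values.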
The kustomize manifests living in the upstream project over there use the rendered YAMLs from @dandydeveloper's chart: https://github.com/argoproj/argo-cd/blob/v2.3.3/manifests/ha/base/redis-ha/chart/requirements.yaml
The intent of this Helm repository is to use the same architecture as the upstream projects (Argo CD, Workflows, etc.). IMHO you should file an issue over there: https://github.com/argoproj/argo-cd/issues/new/choose
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
No-stale
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@mkilchhofer What's the issue here? Redis clustering vs. Sentinel are very different so this change could impact Argo quite a lot.
Feel free to raise the issue in my chart for Redis, I'm pretty active and try my best to maintain. Right now, I'm the only real active maintainer.
Hi @DandyDeveloper,
Context: We at @swisspost tried switching on redis-ha in the Argo CD chart. We used it for 1-2 months or so on our AWS EKS clusters. We use cluster autoscaling and also upgrade our clusters once a week (new AWS AMI for the workers).
Issue: One problem we saw was that one of the 3 Redis pods became unhappy:
$ kubectl logs argocd-server-6499778d-2n56j
(..)
redis: 2022/03/11 10:35:37 pubsub.go:168: redis: discarding bad PubSub connection: EOF
redis: 2022/03/11 10:35:37 pubsub.go:168: redis: discarding bad PubSub connection: EOF
redis: 2022/03/11 10:35:38 pubsub.go:168: redis: discarding bad PubSub connection: EOF
time="2022-03-11T10:35:38Z" level=warning msg="Failed to resync revoked tokens. retrying again in 1 minute: EOF"
redis: 2022/03/11 10:35:38 pubsub.go:168: redis: discarding bad PubSub connection: write tcp 10.116.191.151:46704->172.20.36.205:6379: write: broken pipe
redis: 2022/03/11 10:35:38 pubsub.go:168: redis: discarding bad PubSub connection: write tcp 10.116.191.151:46704->172.20.36.205:6379: write: broken pipe
redis: 2022/03/11 10:35:38 pubsub.go:168: redis: discarding bad PubSub connection: EOF
redis: 2022/03/11 10:35:38 pubsub.go:168: redis: discarding bad PubSub connection: EOF
redis: 2022/03/11 10:35:38 pubsub.go:168: redis: discarding bad PubSub connection: EOF
redis: 2022/03/11 10:35:38 pubsub.go:168: redis: discarding bad PubSub connection: EOF
The Redis logs of at least one replica were full of:
kubectl logs argocd-redis-ha-server-1 -c redis
(..)
1:S 11 Mar 2022 07:09:31.312 * Non blocking connect for SYNC fired the event.
1:S 11 Mar 2022 07:09:31.312 * Master replied to PING, replication can continue...
1:S 11 Mar 2022 07:09:31.313 * Partial resynchronization not possible (no cached master)
1:S 11 Mar 2022 07:09:31.313 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
1:S 11 Mar 2022 07:09:32.324 * Connecting to MASTER 172.20.145.11:6379
1:S 11 Mar 2022 07:09:32.324 * MASTER <-> REPLICA sync started
1:S 11 Mar 2022 07:09:32.324 * Non blocking connect for SYNC fired the event.
1:S 11 Mar 2022 07:09:32.325 * Master replied to PING, replication can continue...
1:S 11 Mar 2022 07:09:32.325 * Partial resynchronization not possible (no cached master)
1:S 11 Mar 2022 07:09:32.326 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
1:S 11 Mar 2022 07:09:33.328 * Connecting to MASTER 172.20.145.11:6379
1:S 11 Mar 2022 07:09:33.328 * MASTER <-> REPLICA sync started
1:S 11 Mar 2022 07:09:33.329 * Non blocking connect for SYNC fired the event.
(..)
Resolution: We then always fixed it like this:
$ kubectl -n argocd delete po -l app=redis-ha
pod "argocd-redis-ha-server-0" deleted
pod "argocd-redis-ha-server-1" deleted
pod "argocd-redis-ha-server-2" deleted
And after 2 months of annoying Redis issues we switched back to single-replica Redis. After that we never faced a Redis-related issue again.
@mkilchhofer How long ago was this?
We had a split-brain scenario that was the result of a bad Sentinel election. It's been resolved permanently a while back by introducing a pod that checks for and explicitly resolves split-brain issues like the one above.
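For anyone hitting this on an older chart version: a quick way to check for the split-brain state described above is to ask every Redis pod for its `INFO replication` role and count how many claim to be master. A rough sketch (the `app=redis-ha` label, `argocd` namespace, and `redis` container name are taken from the commands earlier in this thread; adjust for your install):

```shell
# Count how many Redis instances report "role:master", given the
# concatenated output of `redis-cli INFO replication` on stdin.
# redis-cli output uses CRLF line endings, so strip the \r first.
count_masters() {
  tr -d '\r' | grep -c '^role:master'
}

# Query every redis-ha pod and report whether more than one master exists.
# More than one master means the Sentinel election went wrong (split brain).
check_split_brain() {
  masters=$(
    for pod in $(kubectl -n argocd get pods -l app=redis-ha -o name); do
      kubectl -n argocd exec "$pod" -c redis -- redis-cli INFO replication
    done | count_masters
  )
  if [ "$masters" -gt 1 ]; then
    echo "split brain detected: $masters masters"
  else
    echo "ok: $masters master"
  fi
}
```

If `check_split_brain` reports more than one master, the `kubectl -n argocd delete po -l app=redis-ha` workaround from earlier in the thread forces a clean re-election.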
This is a surface-level assumption; I'd need more logs from the elected master / cluster state to provide more context.
The latest Argo should include the latest Redis chart, so I would highly recommend trying this again.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Why closed???
@mkilchhofer ?
We are seeing this issue too. Not sure why it's been closed.
I maintain the Redis chart being used; the problem in question should have been resolved long ago.
If people are experiencing problems, throw me a link to the issue or describe the issue so I can investigate.
I believe they closed this because my reply indicated things are fixed and we had no follow up.