[bitnami/mongodb] suboptimal deployment setting
Name and Version
bitnami/mongodb
What architecture are you using?
None
What steps will reproduce the bug?
Just install MongoDB with a HelmRelease:
```yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: mongo
---
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: HelmRepository
metadata:
  name: bitnami
  namespace: mongo
spec:
  interval: 5m0s
  timeout: 1m0s
  url: https://charts.bitnami.com/bitnami
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: mongo
  namespace: mongo
spec:
  chart:
    spec:
      chart: mongodb
      sourceRef:
        kind: HelmRepository
        name: bitnami
      version: '*'
  interval: 5m0s
  values:
    architecture: standalone
    auth:
      enabled: true
      rootUser: root
      rootPassword: "e8L85239yHVZ2jwFVzaS"
```
MongoDB is installed and running, but then I see the following:
```console
$ kubectl get pods -n mongo
NAME                             READY   STATUS             RESTARTS           AGE
mongo-mongodb-5dc8c88457-4l4kb   1/1     Running            0                  15d
mongo-mongodb-c686c8bf8-dg7qj    0/1     CrashLoopBackOff   2598 (3m17s ago)   9d
```
The issue is the `useStatefulSet: false` default value. It simply does not work well, and no documentation warns about it.
I am asking you to change the default to `useStatefulSet: true`, which solves the issue of multiple MongoDB pods sharing the same volume.
What do you see instead?
n/a
Hi @gecube
Could you please share the logs of the MongoDB pod that's in CrashLoopBackOff status? When using the "standalone" mode, it should be possible to use a Deployment.
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
@juan131 Hi! Unfortunately, no. I am using a PVC with RWO access mode, so two MongoDB Deployment pods effectively cannot co-exist. There are no logs: only in `kubectl describe` can one see that the PVC is already claimed and the second pod cannot start. Another option could be to make Recreate the default strategy for the Deployment.
That makes sense @gecube !
Regarding alternatives, there's no need to change the update strategy. You can keep the rollingUpdate strategy while tuning maxSurge & maxUnavailable instead:
```console
$ kubectl explain deployment.spec.strategy.rollingUpdate
GROUP:      apps
KIND:       Deployment
VERSION:    v1

FIELD: rollingUpdate <RollingUpdateDeployment>

DESCRIPTION:
    Rolling update config params. Present only if DeploymentStrategyType =
    RollingUpdate.
    Spec to control the desired behavior of rolling update.

FIELDS:
  maxSurge	<IntOrString>
    The maximum number of pods that can be scheduled above the desired number of
    pods. Value can be an absolute number (ex: 5) or a percentage of desired
    pods (ex: 10%). This can not be 0 if MaxUnavailable is 0. Absolute number is
    calculated from percentage by rounding up. Defaults to 25%. Example: when
    this is set to 30%, the new ReplicaSet can be scaled up immediately when the
    rolling update starts, such that the total number of old and new pods do not
    exceed 130% of desired pods. Once old pods have been killed, new ReplicaSet
    can be scaled up further, ensuring that total number of pods running at any
    time during the update is at most 130% of desired pods.

  maxUnavailable	<IntOrString>
    The maximum number of pods that can be unavailable during the update. Value
    can be an absolute number (ex: 5) or a percentage of desired pods (ex: 10%).
    Absolute number is calculated from percentage by rounding down. This can not
    be 0 if MaxSurge is 0. Defaults to 25%. Example: when this is set to 30%,
    the old ReplicaSet can be scaled down to 70% of desired pods immediately
    when the rolling update starts. Once new pods are ready, old ReplicaSet can
    be scaled down further, followed by scaling up the new ReplicaSet, ensuring
    that the total number of pods available at all times during the update is at
    least 70% of desired pods.
```
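For instance, a sketch (untested against the chart, and assuming the chart's `updateStrategy` value is passed through to the Deployment's `spec.strategy`): setting `maxSurge: 0` and `maxUnavailable: 1` forces Kubernetes to terminate the old pod before scheduling its replacement, so the RWO volume is already released when the new pod starts:

```yaml
# Hypothetical HelmRelease values fragment: scale the old ReplicaSet down
# before scaling the new one up, so the single RWO volume never has two
# pods claiming it at the same time.
values:
  architecture: standalone
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0        # never schedule a second pod above the desired count
      maxUnavailable: 1  # allow the old pod to be terminated first
```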
@juan131 Hi! Thanks for your suggestions, but I want to emphasise again that, as a DevOps engineer, I expect a Helm chart to work out of the box on a default cloud such as AWS (with RWO EBS volumes). That does not happen, which means the chart has suboptimal defaults. I am asking you to consider changing them. That's all.
Hi @gecube
Thanks for your feedback, we appreciate it.
We try to offer our users Helm charts that are flexible enough to adapt to a wide range of scenarios. Please note the default values are mainly intended for trying the charts in basic scenarios with simple architectures. It's your responsibility as a DevOps engineer to adapt the chart values to your specific environment and requirements.
> It's your responsibility as a DevOps engineer to adapt the chart values to your specific environment and requirements.
Completely disagree. When I take a solution from open source, I expect it to work in some default scenario. Bitnami's MongoDB chart offers a deployment approach that is proven to be wrong, and I explained why. That's all. It also completely breaks trust in Bitnami's solutions: if Bitnami ever provides enterprise or paid support, this won't create a proper relationship and trust with the client.
Hi @gecube
You're assuming that a multi-node cluster with RWO dynamic volume provisioning running on AWS is the default scenario, while many other users trying the chart may use a very different kind of cluster (e.g. a single-node local cluster using Minikube or kind). Please note there are countless ways to run and operate a k8s cluster, and it's hard to define which one is the default. Users are expected to read the chart documentation, understand the alternative parameters it offers, and adapt them to their specific requirements.
That said, there's a change we might need to introduce in the chart's default values to make it more consistent with the "standalone" architecture: switching the current defaults for podAffinityPreset & podAntiAffinityPreset, which would result in the change below:
```diff
 spec:
   automountServiceAccountToken: false
   serviceAccountName: mongodb
   affinity:
-    podAntiAffinity:
+    podAffinity:
       preferredDuringSchedulingIgnoredDuringExecution:
         - podAffinityTerm:
             labelSelector:
               matchLabels:
                 app.kubernetes.io/instance: mongodb
                 app.kubernetes.io/name: mongodb
                 app.kubernetes.io/component: mongodb
             topologyKey: kubernetes.io/hostname
           weight: 1
```
This would make the solution more consistent, since K8s would try to schedule new MongoDB pods on the same node during rolling updates, avoiding issues with RWO volumes.
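In terms of chart values, a hedged sketch (assuming these preset parameters follow the usual Bitnami convention, where `soft` renders a `preferredDuringSchedulingIgnoredDuringExecution` rule), the switch would amount to flipping the defaults:

```yaml
# Hypothetical new defaults: prefer co-scheduling on the same node
# instead of spreading pods across nodes.
podAffinityPreset: soft     # currently ""
podAntiAffinityPreset: ""   # currently "soft"
```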
@juan131 It is still a bad idea to run Mongo with multiple pods sharing the same volume. Why? Because running any stateful database (MongoDB, PostgreSQL) with transaction logs over a shared data directory leads, in the worst case, to data loss. In the optimistic scenario, the second pod will start, but the instance will see that somebody already holds a lock on the data files and will simply fail. I don't believe this lock is good protection either: if a pod stops unexpectedly, the lock file remains, which means Mongo has to remove it during its recovery process.
> while many other users trying the chart may use a very different kind of cluster (e.g. a single-node local cluster using Minikube or kind).
Totally agree: there are many different ways to run k8s, but they could all be classified and put into groups or buckets.
Hi @gecube
> It is still a bad idea to run Mongo with multiple pods sharing the same volume. Why? Because running any stateful database (MongoDB, PostgreSQL) with transaction logs over a shared data directory leads, in the worst case, to data loss.
The idea of the standalone architecture is to run a single MongoDB node; this architecture doesn't support horizontal scaling. To run N MongoDB instances, use the "replicaset" architecture instead, see:
- https://github.com/bitnami/charts/tree/main/bitnami/mongodb#architecture
@juan131 Thanks for your opinion and reply.
> This architecture doesn't support horizontal scaling.
If so, why not change the default update policy to Recreate? It would resolve the issue, and the change is trivial, a real low-hanging fruit. I could prepare a PR if you want :-) As I explained before, the MongoDB chart is unfortunately not ready for use as-is, which is a great pity, since Bitnami has a very recognisable name and is usually the first choice for PoCs.
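Concretely, a sketch of the proposed default (assuming `updateStrategy` is the chart value rendered into the Deployment's `spec.strategy`):

```yaml
# Proposed default for the standalone architecture: Recreate tears down
# the old pod (releasing its RWO PersistentVolume) before the new pod is
# created, so two pods can never claim the volume at the same time.
updateStrategy:
  type: Recreate
```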
Hi @gecube
I must admit I was a little bit stubborn on studying different alternatives but you convinced me. It makes sense to use Recreate with the standalone architecture so please go ahead and create the PR, I'll be glad to review it.
Small request: could we add a new validation at _helpers.tpl#L284-L300 to warn users who switch the architecture to "replicaset" that they must change the default update strategy type? Something like:
```yaml
{{/*
Validate values of MongoDB® - must provide a valid update strategy type
*/}}
{{- define "mongodb.validateValues.updateStrategy" -}}
{{- if eq .Values.updateStrategy.type "Recreate" }}
{{- if eq .Values.architecture "replicaset" -}}
mongodb: updateStrategy.type
    Only "RollingUpdate" and "OnDelete" update strategy types are supported
    using the "replicaset" architecture, since it is based on statefulsets.
{{- else if .Values.useStatefulSet -}}
mongodb: updateStrategy.type
    By specifying "useStatefulSet=true", only "RollingUpdate" and "OnDelete"
    update strategy types are supported.
{{- end -}}
{{- end -}}
{{- end -}}
```
@juan131 Hi! Thanks! I will do it and get back to you.
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
I still haven't had a chance to work on the PR; I will try to reserve some time. I kindly ask you to remove the stale label.
Label removed. Don't worry @gecube, take all the time you need.
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.