couchdb-helm
Container couchdb is going into a restart loop right after deploy without any logs
Describe the bug
Upon executing helm install couchdb -n couchdb couchdb/couchdb -f values.yaml, the main container enters a continuous restart loop without any explanatory logs. The issue surfaces only when persistence is enabled; without it, the container starts successfully. The PVC and PV are properly created, mounted, and writable (I tested this from another container).
Experimenting with a custom Deployment resulted in the same behaviour. Consequently, the issue could originate from my storage configuration or permissions, and from how the Docker container or the software expects them to be set up. It's noteworthy that other applications (Prometheus, RabbitMQ) operate without issues on the same storage, cluster, and Helm.
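For reference, a write test along these lines is how the volume can be checked from another container (a minimal sketch: the pod name is hypothetical, and the claimName follows the chart's PVC naming, so adjust it and the namespace to your release):

apiVersion: v1
kind: Pod
metadata:
  name: pvc-write-test          # hypothetical debug pod, not part of the chart
  namespace: couchdb
spec:
  restartPolicy: Never
  containers:
    - name: tester
      image: busybox:latest
      # write a file into the mounted volume and show its ownership and mode
      command: ["sh", "-c", "touch /data/write-test && ls -ld /data && id"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: database-storage-couchdb-couchdb-0   # PVC created by the chart's StatefulSet; adjust to your release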
Any information or further steps will be appreciated. Thank you!
Version of Helm and Kubernetes:
Kubernetes
Provider: Amazon EKS, Kubernetes Version: v1.24.13 -0a21954
Helm:
version.BuildInfo{Version:"v3.9.4", GitCommit:"dbc6d8e20fe1d58d50e6ed30f09a04a77e4c68db", GitTreeState:"clean", GoVersion:"go1.17.13"}
StorageClass:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-storage-class-wa
allowVolumeExpansion: true
parameters:
  basePath: /dynamic_provisioning
  directoryPerms: '700'
  fileSystemId: <fs>
  gidRangeEnd: '2000'
  gidRangeStart: '1000'
  provisioningMode: efs-ap
provisioner: efs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
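Note: with provisioningMode: efs-ap the driver provisions an EFS access point whose root directory gets the directoryPerms above ('700') and an owner GID from the gidRange, so a quick sanity check is to compare that ownership against the UID/GID the CouchDB container runs as (5984 by default in the official image, as I understand it). A rough check, run from any pod that mounts the volume:

ls -ldn /opt/couchdb/data                             # numeric owner UID/GID and mode of the access-point directory (use the pod's mount path)
id                                                    # the UID/GID the container process actually runs as
touch /opt/couchdb/data/.writetest && echo writable   # fails with "Permission denied" on a mismatch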
What happened:
The StatefulSet is unable to start with Amazon EFS persistent storage.
How to reproduce it (as minimally and precisely as possible):
Create EFS Storage on EKS and deploy following the guide in the README.
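Concretely (a sketch; chart repository URL as given in the project README):

helm repo add couchdb https://apache.github.io/couchdb-helm
helm repo update
kubectl create namespace couchdb
helm install couchdb -n couchdb couchdb/couchdb -f values.yaml   # values.yaml as listed below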
Anything else we need to know:
values.yaml
# -- the initial number of nodes in the CouchDB cluster.
clusterSize: 1
persistentVolume:
  enabled: true
  storageClass: "efs-storage-class-wa"
  accessModes:
    - ReadWriteOnce
  size: 10Gi
networkPolicy:
  enabled: false
image:
  tag: 3.3.2
dns:
  clusterDomainSuffix: cluster.local
service:
  enabled: true
prometheusPort:
  enabled: true
  bind_address: "0.0.0.0"
  port: 8080
couchdbConfig:
  chttpd:
    bind_address: any
    require_valid_user: false
  couchdb:
    uuid: 4714aa87edb4be946671309fbec8941a
kubectl describe pod couchdb-qa-couchdb-0 -n couchdb-qa
Name: couchdb-qa-couchdb-0
Namespace: couchdb-qa
Priority: 0
Node: ip-10-152-181-13.eu-west-1.compute.internal/10.152.181.13
Start Time: Wed, 07 Jun 2023 12:34:11 +0300
Labels: app=couchdb
controller-revision-hash=couchdb-qa-couchdb-b6c8db589
release=couchdb-qa
statefulset.kubernetes.io/pod-name=couchdb-qa-couchdb-0
Status: Running
Controlled By: StatefulSet/couchdb-qa-couchdb
Init Containers:
init-copy:
Container ID: containerd://de3c35142624b77f0c8abcca439f5b436ac0a23666e88cf0a5274f00e6558ca8
Image: busybox:latest
Image ID: docker.io/library/busybox@sha256:560af6915bfc8d7630e50e212e08242d37b63bd5c1ccf9bd4acccf116e262d5b
Port: <none>
Host Port: <none>
Command:
sh
-c
cp /tmp/chart.ini /default.d; cp /tmp/seedlist.ini /default.d; cp /tmp/prometheus.ini /default.d; ls -lrt /default.d;
State: Terminated
Reason: Completed
Exit Code: 0
Started: Wed, 07 Jun 2023 12:34:13 +0300
Finished: Wed, 07 Jun 2023 12:34:13 +0300
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/default.d from config-storage (rw)
/tmp/ from config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-w5kzb (ro)
Containers:
couchdb:
Container ID: containerd://ff97bde75fd9ce3ea58d962bb8aa8e35902af2584bea4ac16ba0317d60b35a1f
Image: couchdb:3.3.2
Image ID: docker.io/library/couchdb@sha256:efd8eefd6e849ac88a5418bd4e633002e9f665fd6b16c3eb431656984203cfec
Ports: 5984/TCP, 4369/TCP, 9100/TCP, 8080/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Wed, 07 Jun 2023 12:34:14 +0300
Finished: Wed, 07 Jun 2023 12:34:14 +0300
Ready: False
Restart Count: 1
Liveness: http-get http://:5984/_up delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:5984/_up delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
COUCHDB_USER: <set to the key 'adminUsername' in secret 'couchdb-qa-couchdb'> Optional: false
COUCHDB_PASSWORD: <set to the key 'adminPassword' in secret 'couchdb-qa-couchdb'> Optional: false
COUCHDB_SECRET: <set to the key 'cookieAuthSecret' in secret 'couchdb-qa-couchdb'> Optional: false
COUCHDB_ERLANG_COOKIE: <set to the key 'erlangCookie' in secret 'couchdb-qa-couchdb'> Optional: false
ERL_FLAGS: -name couchdb -setcookie XXXXXXXXXXX
Mounts:
/opt/couchdb/data from database-storage (rw)
/opt/couchdb/etc/default.d from config-storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-w5kzb (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
database-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: database-storage-couchdb-qa-couchdb-0
ReadOnly: false
config-storage:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: couchdb-qa-couchdb
Optional: false
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4s default-scheduler Successfully assigned couchdb-qa/couchdb-qa-couchdb-0 to node1
Normal Pulling 3s kubelet Pulling image "busybox:latest"
Normal Pulled 2s kubelet Successfully pulled image "busybox:latest" in 593.012622ms
Normal Created 2s kubelet Created container init-copy
Normal Started 2s kubelet Started container init-copy
Normal Created 1s (x2 over 2s) kubelet Created container couchdb
Normal Started 1s (x2 over 2s) kubelet Started container couchdb
Warning BackOff <invalid> (x4 over 0s) kubelet Back-off restarting failed container
kubectl logs couchdb-qa-couchdb-0 -n couchdb-qa
Defaulted container "couchdb" out of: couchdb, init-copy (init)
kubectl logs couchdb-qa-couchdb-0 --container init-copy -n couchdb-qa
total 12
-rw-r--r-- 1 root root 98 Jun 7 09:34 seedlist.ini
-rw-r--r-- 1 root root 71 Jun 7 09:34 prometheus.ini
-rw-r--r-- 1 root root 106 Jun 7 09:34 chart.ini
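The couchdb container exits without writing anything to stdout/stderr, so the only other places left to look are the previous (crashed) instance and the pod events; a sketch, names as above:

kubectl logs couchdb-qa-couchdb-0 -c couchdb -n couchdb-qa --previous    # logs of the last crashed instance, if any
kubectl get events -n couchdb-qa --field-selector involvedObject.name=couchdb-qa-couchdb-0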
We are also experiencing this same issue when trying to go to 3.3.2. We've been able to successfully go to 3.2.1 for the time being.
That is not helping in my case. I get the same behaviour with different versions, even 2.X.X.
I have the same problem as described here: https://github.com/apache/couchdb-helm/issues/123. Rolling back to version 3.2.1 did not solve the problem.
My guess is that this is a permissions issue. If you can reproduce it in a test environment, I would see whether you can get the container running using a custom command, e.g. update the deployment to set:
command: [ "/bin/bash", "-c", "--" ]
args: [ "while true; do sleep 30; done;" ]
and then exec into the container. The standard container entrypoint is defined at https://github.com/apache/couchdb-docker/blob/main/3.3.2/docker-entrypoint.sh, so you could try running that manually from a shell and see whether any commands fail.
I am not sure how to change the entrypoint of a Docker image that is being deployed via a Helm chart. values.yaml doesn't give me such an option...
@lolszowy I would just kubectl edit the deployment manifest directly after deploying with Helm.
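For example, something along these lines (a sketch against the StatefulSet from the describe output above; the in-container entrypoint path is taken from the linked couchdb-docker repo, adjust if it differs):

kubectl -n couchdb-qa patch statefulset couchdb-qa-couchdb --type=json -p='[
  {"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["/bin/bash", "-c", "--"]},
  {"op": "add", "path": "/spec/template/spec/containers/0/args", "value": ["while true; do sleep 30; done;"]}
]'
# the StatefulSet controller recreates the pod with the override applied, then:
kubectl -n couchdb-qa exec -it couchdb-qa-couchdb-0 -- /bin/bash
# and inside the container run the stock entrypoint by hand to see which command fails, e.g.:
#   /docker-entrypoint.sh /opt/couchdb/bin/couchdb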
As the author of the issue I am sorry, but currently I don't have much time to invest in it. As soon as I can, I will proceed with further testing too. I tested with different storage classes (Amazon EBS and Longhorn) and they worked as expected.
The problem is definitely with mounting the PV. There is no problem running CouchDB without a PV, but even when I try to create a local PV it crashes in the same way.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: couchdb-statefulset
spec:
  selector:
    matchLabels:
      app: couchdb
  serviceName: couchdb-service
  replicas: 1
  template:
    metadata:
      labels:
        app: couchdb
    spec:
      containers:
        - name: couchdb
          image: couchdb:3.3.1
          ports:
            - containerPort: 5984
          volumeMounts:
            - name: couchdb-data
              mountPath: /opt/couchdb/
      volumes:
        - name: couchdb-data
          persistentVolumeClaim:
            claimName: couchdb-pvc-local
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: couchdb-pvc-local
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-local-pv
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  local:
    path: /mnt
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - ip-10-3-1-81.eu-west-1.compute.internal
                - ip-10-3-2-247.eu-west-1.compute.internal
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
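Applied and checked roughly like this (a sketch; the filename is hypothetical, and the namespaced commands are run in the test namespace, tpos-sync per the events below):

kubectl -n tpos-sync apply -f couchdb-local-pv.yaml   # hypothetical file containing the manifests above
kubectl -n tpos-sync get pvc couchdb-pvc-local        # with WaitForFirstConsumer the claim stays Pending until the pod is scheduled
kubectl get pv my-local-pv                            # PVs are cluster-scoped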
kubectl describe pod couchdb-statefulset-0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 8s default-scheduler 0/3 nodes are available: persistentvolumeclaim "couchdb-pvc-local" not found. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod..
Normal Scheduled 5s default-scheduler Successfully assigned tpos-sync/couchdb-statefulset-0 to ip-10-3-2-247.eu-west-1.compute.internal
Normal Pulled 3s (x2 over 4s) kubelet Container image "couchdb:3.3.1" already present on machine
Normal Created 3s (x2 over 4s) kubelet Created container couchdb
Normal Started 3s (x2 over 4s) kubelet Started container couchdb
Warning BackOff 2s kubelet Back-off restarting failed container couchdb in pod couchdb-statefulset-0_tpos-sync(92a8c35d-c8e1-4dee-b745-a8f4be50c106)
While using the Helm chart I got this error: Warning FailedScheduling 62s default-scheduler 0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.