mayastor
grpc, rest and msp pods stuck in init
Related to #1150.
Adding 3 more nodes resolved the error messages, but the pods are still stuck in init.
Any advice appreciated.
Reporting a similar issue here:
NAME                              READY   STATUS             RESTARTS         AGE
core-agents-6b6bfc75-88pdm        0/1     Init:1/2           0                8m48s
csi-controller-754885db79-9x955   0/3     Init:0/1           0                8m48s
mayastor-bc6kd                    1/1     Running            0                8m48s
mayastor-csi-6pg5v                1/2     CrashLoopBackOff   6 (2m51s ago)    8m48s
mayastor-csi-gfp4r                1/2     CrashLoopBackOff   6 (3m1s ago)     8m47s
mayastor-csi-h4ttr                1/2     CrashLoopBackOff   6 (2m37s ago)    8m47s
mayastor-d88f8                    1/1     Running            0                8m47s
mayastor-etcd-0                   0/1     Pending            0                8m48s
mayastor-etcd-1                   0/1     Pending            0                8m47s
mayastor-etcd-2                   0/1     Pending            0                8m47s
mayastor-slzbx                    1/1     Running            0                8m47s
msp-operator-864dd49b79-rwgm6     0/1     Init:1/2           0                8m48s
nats-0                            2/2     Running            0                8m48s
nats-1                            2/2     Running            0                8m27s
nats-2                            2/2     Running            0                8m17s
rest-765c7c6d5b-h8lxx             0/1     Init:1/2           0                8m48s
I have three mayastor worker nodes.
However, I did manage to start it successfully once:
NAME                              READY   STATUS             RESTARTS         AGE
core-agents-6b6bfc75-rxwk6        1/1     Running            0                4h1m
csi-controller-754885db79-cwfm9   3/3     Running            0                4h1m
mayastor-csi-gj9v4                1/2     CrashLoopBackOff   51 (4m8s ago)    4h1m
mayastor-csi-kvjbx                1/2     CrashLoopBackOff   51 (4m31s ago)   4h1m
mayastor-csi-qrvf5                1/2     CrashLoopBackOff   51 (3m46s ago)   4h1m
mayastor-etcd-0                   1/1     Running            0                4h1m
mayastor-etcd-1                   1/1     Running            0                4h1m
mayastor-etcd-2                   1/1     Running            0                4h1m
mayastor-h5bxg                    1/1     Running            0                4h1m
mayastor-nkvmg                    1/1     Running            0                4h1m
mayastor-q2l99                    1/1     Running            0                4h1m
msp-operator-864dd49b79-vfltd     1/1     Running            0                4h1m
nats-0                            2/2     Running            0                4h1m
nats-1                            2/2     Running            0                4h1m
nats-2                            2/2     Running            0                4h1m
rest-765c7c6d5b-gk7xv             1/1     Running            0                4h1m
@tz-torchai I managed to resolve it by setting the namespace's Pod Security enforcement level to privileged, so that the containers running in it are allowed to run privileged:
apiVersion: v1
kind: Namespace
metadata:
  labels:
    kubernetes.io/metadata.name: mayastor
    pod-security.kubernetes.io/enforce: privileged
  name: mayastor
spec:
  finalizers:
  - kubernetes
status:
  phase: Active
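For anyone who prefers not to edit the whole Namespace manifest, the same label can probably be applied in place. This is a minimal sketch, assuming the namespace is named mayastor as in the manifest above and that kubectl points at the affected cluster:

```shell
# Set the Pod Security Admission enforce level on the existing namespace.
# --overwrite replaces any enforce label that is already present.
kubectl label namespace mayastor \
  pod-security.kubernetes.io/enforce=privileged --overwrite

# Confirm the label is now set on the namespace.
kubectl get namespace mayastor --show-labels
```

Pods that were rejected by admission before the label was set may need to be recreated by their controllers before they show up.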
I would like to know if there is a better approach, or if I am missing some privileges.
@tz-torchai from your first snippet:
mayastor-etcd-0                   0/1     Pending            0                8m48s
mayastor-etcd-1                   0/1     Pending            0                8m47s
mayastor-etcd-2                   0/1     Pending            0                8m47s
etcd is not running, which is why the other pods are stuck in the init phase.
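To find out why the etcd pods stay Pending, something like the following may help. It is only a sketch: it assumes the pods live in a namespace called mayastor, and the PVC check assumes this etcd deployment uses persistent volume claims, which the thread does not confirm.

```shell
# The Events section at the end of the output usually states why the
# scheduler could not place the pod (resources, taints, unbound volumes).
kubectl -n mayastor describe pod mayastor-etcd-0

# Assumption: etcd uses PVCs. A claim stuck in Pending would keep the pod Pending too.
kubectl -n mayastor get pvc

# Recent events in the namespace, oldest first.
kubectl -n mayastor get events --sort-by=.lastTimestamp
```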
@wibed would you be able to reproduce it again and post the output of kubectl get pods (after a few minutes)?
I fixed this by removing the following block from the spec of each affected pod with init probes:
hostNetwork: true
# To resolve services in the namespace
dnsPolicy: ClusterFirstWithHostNet
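To see which pods currently set these two fields before removing anything, a listing along these lines may be useful. It is a sketch that assumes the pods are in a namespace called mayastor. The column headers are arbitrary labels chosen here, and pods that do not set hostNetwork show `<none>` in that column.

```shell
# List each pod with its hostNetwork and dnsPolicy settings.
kubectl -n mayastor get pods \
  -o custom-columns=NAME:.metadata.name,HOST_NETWORK:.spec.hostNetwork,DNS_POLICY:.spec.dnsPolicy
```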