ibm-spectrum-scale-csi
CSI pods restart when network policy for GUI is deleted
Describe the bug
When the network policy for the GUI is deleted and the ConfigMap is then changed, the CSI pods restart and the sidecars go into CrashLoopBackOff.
How to Reproduce?
Please list the steps to help development teams reproduce the behavior
- Install CSI with the #1050 images as follows:
[root@OCP network]# oc get pods
NAME READY STATUS RESTARTS AGE
csi-scale-fsetdemo-pod-5 1/1 Running 0 4d20h
ibm-spectrum-scale-csi-attacher-775c787cd7-6tlkv 1/1 Running 0 4d9h
ibm-spectrum-scale-csi-attacher-775c787cd7-w2bhl 1/1 Running 0 4d9h
ibm-spectrum-scale-csi-gp946 3/3 Running 0 4d9h
ibm-spectrum-scale-csi-mrwbj 3/3 Running 0 4d9h
ibm-spectrum-scale-csi-operator-7fb8d8f6f9-sls6c 1/1 Running 9 (32h ago) 5d
ibm-spectrum-scale-csi-pctst 3/3 Running 0 4d9h
ibm-spectrum-scale-csi-provisioner-74dc9dff59-rhdwc 1/1 Running 0 4d9h
ibm-spectrum-scale-csi-resizer-78f7684fff-46zx2 1/1 Running 0 4d9h
ibm-spectrum-scale-csi-snapshotter-5f77874594-pc9zk 1/1 Running 0 4d9h
[root@OCP network]# oc get cso
NAME VERSION SUCCESS
ibm-spectrum-scale-csi 2.10.0 True
[root@OCP network]# oc describe pod | grep quay
Image: quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.10.0-011123
Image ID: quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:05d24a16359c9479a7917d8086a94e1ddde8918f86f75c34ba4d1f9498362ea2
Image: quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.10.0-011123
Image ID: quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:05d24a16359c9479a7917d8086a94e1ddde8918f86f75c34ba4d1f9498362ea2
Image: quay.io/hemalatha_gajendran/tunable_host_network_latest
Image ID: quay.io/hemalatha_gajendran/tunable_host_network_latest@sha256:1e25ff135f6b7ac9cf4d845d2341c5403f210f90014ce69fc9bbb2d799812cb5
CSI_DRIVER_IMAGE: quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.10.0-011123
Image: quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.10.0-011123
Image ID: quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:05d24a16359c9479a7917d8086a94e1ddde8918f86f75c34ba4d1f9498362ea2
- Apply the ConfigMap as follows:
[root@OCP pr1050]# cat cm.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: ibm-spectrum-scale-csi-config
  namespace: ibm-spectrum-scale-csi
data:
  HOST_NETWORK: DISABLED
[root@OCP pr1050]# oc apply -f cm.yaml
configmap/ibm-spectrum-scale-csi-config created
- Create the network policies as follows:
[root@OCP network]# oc get networkpolicy
NAME POD-SELECTOR AGE
allow-dns-acces <none> 5d3h
allow-egress-apiserver <none> 5d3h
allow-gui-route <none> 4d23h
[root@OCP network]# oc get networkpolicy -oyaml
apiVersion: v1
items:
- apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"networking.k8s.io/v1","kind":"NetworkPolicy","metadata":{"annotations":{},"name":"allow-dns-acces","namespace":"ibm-spectrum-scale-csi"},"spec":{"egress":[{"ports":[{"port":5353,"protocol":"UDP"},{"port":5353,"protocol":"TCP"}]}],"ingress":[{"ports":[{"port":5353,"protocol":"UDP"},{"port":5353,"protocol":"TCP"}]}],"policyTypes":["Egress","Ingress"]}}
    creationTimestamp: "2023-11-02T04:13:33Z"
    generation: 1
    name: allow-dns-acces
    namespace: ibm-spectrum-scale-csi
    resourceVersion: "7879033"
    uid: 0059c64d-4f72-4ea7-8383-7b2730c630b1
  spec:
    egress:
    - ports:
      - port: 5353
        protocol: UDP
      - port: 5353
        protocol: TCP
    ingress:
    - ports:
      - port: 5353
        protocol: UDP
      - port: 5353
        protocol: TCP
    podSelector: {}
    policyTypes:
    - Egress
    - Ingress
  status: {}
- apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"networking.k8s.io/v1","kind":"NetworkPolicy","metadata":{"annotations":{},"name":"allow-egress-apiserver","namespace":"ibm-spectrum-scale-csi"},"spec":{"egress":[{"ports":[{"port":443,"protocol":"TCP"},{"port":6443,"protocol":"TCP"}],"to":[{"ipBlock":{"cidr":"10.13.19.200/32"}}]},{"ports":[{"port":443,"protocol":"TCP"},{"port":6443,"protocol":"TCP"}],"to":[{"ipBlock":{"cidr":"10.13.25.22/32"}}]},{"ports":[{"port":443,"protocol":"TCP"},{"port":6443,"protocol":"TCP"}],"to":[{"ipBlock":{"cidr":"10.13.26.216/32"}}]}],"podSelector":{},"policyTypes":["Egress"]}}
    creationTimestamp: "2023-11-02T04:30:15Z"
    generation: 1
    name: allow-egress-apiserver
    namespace: ibm-spectrum-scale-csi
    resourceVersion: "7885041"
    uid: dc70a12b-ce0d-4953-b7ee-e865348d36aa
  spec:
    egress:
    - ports:
      - port: 443
        protocol: TCP
      - port: 6443
        protocol: TCP
      to:
      - ipBlock:
          cidr: 10.13.19.200/32
    - ports:
      - port: 443
        protocol: TCP
      - port: 6443
        protocol: TCP
      to:
      - ipBlock:
          cidr: 10.13.25.22/32
    - ports:
      - port: 443
        protocol: TCP
      - port: 6443
        protocol: TCP
      to:
      - ipBlock:
          cidr: 10.13.26.216/32
    podSelector: {}
    policyTypes:
    - Egress
  status: {}
- apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"networking.k8s.io/v1","kind":"NetworkPolicy","metadata":{"annotations":{},"name":"allow-gui-route","namespace":"ibm-spectrum-scale-csi"},"spec":{"egress":[{"ports":[{"port":443,"protocol":"TCP"}],"to":[{"ipBlock":{"cidr":"<OCP-IP>/32"}},{"ipBlock":{"cidr":"10.11.125.246/32"}}]}],"podSelector":{},"policyTypes":["Egress"]}}
    creationTimestamp: "2023-11-02T08:48:02Z"
    generation: 1
    name: allow-gui-route
    namespace: ibm-spectrum-scale-csi
    resourceVersion: "7986486"
    uid: 936c2545-3c35-4eea-88a8-3db85949eb46
  spec:
    egress:
    - ports:
      - port: 443
        protocol: TCP
      to:
      - ipBlock:
          cidr: <OCP-IP>/32
      - ipBlock:
          cidr: 10.11.125.246/32
    podSelector: {}
    policyTypes:
    - Egress
  status: {}
kind: List
metadata:
  resourceVersion: ""
[root@OCP network]#
[root@OCP network]# kubectl get endpoints kubernetes -n default
NAME ENDPOINTS AGE
kubernetes 10.13.19.200:6443,10.13.25.22:6443,10.13.26.216:6443 18d
- Delete the GUI network policy as follows:
[root@OCP network]# oc delete networkpolicy allow-gui-route
networkpolicy.networking.k8s.io "allow-gui-route" deleted
- Update the ConfigMap values; here I'm changing the log level to TRACE:
[root@OCP pr1050]# cat cm.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: ibm-spectrum-scale-csi-config
  namespace: ibm-spectrum-scale-csi
data:
  HOST_NETWORK: DISABLED
  VAR_DRIVER_LOGLEVEL: TRACE
[root@OCP pr1050]# oc apply -f cm.yaml
configmap/ibm-spectrum-scale-csi-config configured
- Now check the CSI pods:
[root@OCP pr1050]# oc get pods
NAME READY STATUS RESTARTS AGE
ibm-spectrum-scale-csi-5j98c 3/3 Running 2 (27s ago) 2m28s
ibm-spectrum-scale-csi-64bfh 3/3 Running 2 (34s ago) 2m35s
ibm-spectrum-scale-csi-attacher-6f4ddd869b-h2tcx 1/1 Running 5 (14s ago) 2m36s
ibm-spectrum-scale-csi-attacher-6f4ddd869b-rhvtn 1/1 Running 5 (56s ago) 2m36s
ibm-spectrum-scale-csi-operator-7fb8d8f6f9-sls6c 1/1 Running 9 (33h ago) 5d
ibm-spectrum-scale-csi-provisioner-846ddb9cb8-jrjr2 0/1 CrashLoopBackOff 4 (36s ago) 2m36s
ibm-spectrum-scale-csi-q5rbz 3/3 Running 2 (30s ago) 2m31s
ibm-spectrum-scale-csi-resizer-5fb9fb844b-gwzv9 0/1 CrashLoopBackOff 4 (35s ago) 2m36s
ibm-spectrum-scale-csi-snapshotter-76f97d78fc-w7zjn 0/1 CrashLoopBackOff 4 (35s ago) 2m36s
[root@OCP pr1050]# oc get cso
NAME VERSION SUCCESS
ibm-spectrum-scale-csi 2.10.0 False
- Check the CSO description:
[root@OCP pr1050]# oc describe cso
Name: ibm-spectrum-scale-csi
Namespace: ibm-spectrum-scale-csi
Labels: <none>
Annotations: <none>
API Version: csi.ibm.com/v1
Kind: CSIScaleOperator
Metadata:
Creation Timestamp: 2023-10-27T04:05:14Z
Finalizers:
finalizer.csiscaleoperators.csi.ibm.com
Generation: 2
Managed Fields:
API Version: csi.ibm.com/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
v:"finalizer.csiscaleoperators.csi.ibm.com":
f:ownerReferences:
k:{"uid":"33f69ae8-082a-4986-90dc-2e698a078924"}:
f:spec:
f:attacherNodeSelector:
f:clusters:
f:consistencyGroupPrefix:
f:nodeMapping:
f:pluginNodeSelector:
f:provisionerNodeSelector:
f:resizerNodeSelector:
f:snapshotterNodeSelector:
f:tolerations:
Manager: /ibm-spectrum-scale
Operation: Apply
Time: 2023-10-27T04:05:14Z
API Version: csi.ibm.com/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.:
v:"finalizer.csiscaleoperators.csi.ibm.com":
f:spec:
f:consistencyGroupPrefix:
Manager: manager
Operation: Update
Time: 2023-10-27T04:05:27Z
API Version: csi.ibm.com/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:conditions:
f:versions:
Manager: CSIScaleOperator
Operation: Update
Subresource: status
Time: 2023-11-07T07:56:50Z
Owner References:
API Version: scale.spectrum.ibm.com/v1beta1
Block Owner Deletion: true
Controller: true
Kind: Cluster
Name: ibm-spectrum-scale
UID: 33f69ae8-082a-4986-90dc-2e698a078924
Resource Version: 10810726
UID: 34c6b7cc-5148-4562-a9d8-88765ff8f953
Spec:
Attacher Node Selector:
Key: scale
Value: true
Clusters:
Id: 10383269897192936933
Primary:
Primary Fs: fs0
Primary Fset: primary-fileset-fs0-10383269897192936933
Remote Cluster: 4345204465007136554
Rest API:
Gui Host: ibm-spectrum-scale-gui-ibm-spectrum-scale.apps.cnsa-shrutikanipane148a.cp.fyre.ibm.com
Secrets: ibm-spectrum-scale-gui-csiadmin
Secure Ssl Mode: false
Id: 4345204465007136554
Rest API:
Gui Host: remote-shrutikanipane148a-2.fyre.ibm.com
Gui Port: 443
Secrets: csi-remote-mount-storage-cluster-1
Secure Ssl Mode: false
Consistency Group Prefix: a5ac401f-882a-4246-a03f-55592a7a07d7
Node Mapping:
k8sNode: worker0.cnsa-shrutikanipane148a.cp.fyre.ibm.com
Spectrumscale Node: worker0
k8sNode: worker1.cnsa-shrutikanipane148a.cp.fyre.ibm.com
Spectrumscale Node: worker1
k8sNode: worker2.cnsa-shrutikanipane148a.cp.fyre.ibm.com
Spectrumscale Node: worker2
k8sNode: master0.cnsa-shrutikanipane148a.cp.fyre.ibm.com
Spectrumscale Node: master0
k8sNode: master1.cnsa-shrutikanipane148a.cp.fyre.ibm.com
Spectrumscale Node: master1
k8sNode: master2.cnsa-shrutikanipane148a.cp.fyre.ibm.com
Spectrumscale Node: master2
Plugin Node Selector:
Key: scale
Value: true
Provisioner Node Selector:
Key: scale
Value: true
Resizer Node Selector:
Key: scale
Value: true
Snapshotter Node Selector:
Key: scale
Value: true
Tolerations:
Effect: NoSchedule
Operator: Exists
Effect: NoExecute
Operator: Exists
Key: CriticalAddonsOnly
Operator: Exists
Status:
Conditions:
Last Transition Time: 2023-11-07T07:56:50Z
Message: Failed to set defaults on the instance ibm-spectrum-scale-csi. Please check Operator logs
Reason: UpdateFailed
Status: False
Type: Success
Versions:
Name: ibm-spectrum-scale-csi
Version: 2.10.0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal CSIConfigured 41s (x13 over 33h) CSIScaleOperator The CSI driver resources have been created/updated successfully
Warning UpdateFailed 41s (x4 over 11h) CSIScaleOperator Failed to set defaults on the instance ibm-spectrum-scale-csi. Please check Operator logs
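To illustrate why the GUI may stop being reachable once allow-gui-route is deleted, here is a minimal Python sketch that evaluates the egress rules from the policies listed in the steps above. It is a simplified model and not how the cluster network plugin enforces policies: it only handles ipBlock and ports, assumes every policy selects all pods (podSelector: {}), and leaves out the redacted <OCP-IP>/32 entry. The names POLICIES, rule_allows and egress_allowed are made up for this example.

```python
import ipaddress

# Egress rules copied from the policies in this issue (ipBlock and ports only).
# A rule without "to" matches any destination.
# The redacted <OCP-IP>/32 entry of allow-gui-route is omitted.
API_SERVERS = ["10.13.19.200/32", "10.13.25.22/32", "10.13.26.216/32"]
POLICIES = {
    "allow-dns-acces": [{"ports": [(5353, "UDP"), (5353, "TCP")]}],
    "allow-egress-apiserver": [
        {"ports": [(443, "TCP"), (6443, "TCP")], "to": [cidr]} for cidr in API_SERVERS
    ],
    "allow-gui-route": [{"ports": [(443, "TCP")], "to": ["10.11.125.246/32"]}],
}


def rule_allows(rule, ip, port, protocol):
    """True if one egress rule permits traffic to ip:port over protocol."""
    port_ok = (port, protocol) in rule["ports"]
    cidrs = rule.get("to")
    dest_ok = cidrs is None or any(
        ipaddress.ip_address(ip) in ipaddress.ip_network(cidr) for cidr in cidrs
    )
    return port_ok and dest_ok


def egress_allowed(policies, ip, port, protocol="TCP"):
    """Policies are additive: traffic passes if any rule of any policy allows it.

    Only meaningful while at least one Egress policy selects the pod;
    with no such policy, Kubernetes allows all egress.
    """
    return any(
        rule_allows(rule, ip, port, protocol)
        for rules in policies.values()
        for rule in rules
    )


target = "10.11.125.246"  # address listed in allow-gui-route
print(egress_allowed(POLICIES, target, 443))   # True
remaining = {k: v for k, v in POLICIES.items() if k != "allow-gui-route"}
print(egress_allowed(remaining, target, 443))  # False
```

Under this model, after the delete the two remaining policies still restrict egress for every pod in the namespace, so port 443 is only open towards the three API server addresses. That would be consistent with the restarted driver pods failing to reach the GUI, though the operator and driver logs are needed to confirm it.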
Expected behavior
As shown above, the CSI pods restart after the ConfigMap change; they should not restart.
Logs:
/scale-csi/D.1058 mustgather.tar.gz
A similar issue also occurs when no network policy is applied and host network is enabled, if the GUI of the primary fs is not reachable and we apply or change the ConfigMap.
Improvement for the future: the operator should run a prerequisite check before restarting the driver pods when the ConfigMap is applied or changed. That would stop all of these CSI pod restarts.
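A rough sketch of what such a prerequisite check could look like, written in Python purely for illustration. This is not the operator's code, and the function names (gui_reachable, unreachable_guis, should_restart_driver) are hypothetical. It only tests that a TCP connection to each GUI endpoint can be opened; it does not validate TLS, credentials, or the REST API itself.

```python
import socket


def gui_reachable(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def unreachable_guis(endpoints, timeout=3.0):
    """Return the (host, port) pairs that could not be reached."""
    return [(h, p) for h, p in endpoints if not gui_reachable(h, p, timeout)]


def should_restart_driver(endpoints, timeout=3.0):
    """Hypothetical gate: roll out the ConfigMap change only if every GUI answers."""
    return not unreachable_guis(endpoints, timeout)
```

In the setup from this issue, the endpoints would be the two Gui Host values from the CSO spec (port 443 is given for the remote cluster and assumed here for the primary one). If should_restart_driver returned False, the operator could report the unreachable GUI in the CSO status and leave the running driver pods untouched.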
@hemalathagajendran What is the issue here? Is this expected behaviour? If not, we should document it.