Derek Su

Results: 1074 comments by Derek Su

Here is the PoC setup and result.
**Setup**: 3 worker nodes for the Longhorn cluster
- Attach 1 RWO volume to node-1
- Attach 2 RWO volumes to node-2...

Proceeding with the PoC tests: 1. Turn the deployment in https://github.com/longhorn/longhorn/blob/master/examples/rwx/rwx-nginx-deployment.yaml into a daemonset and disable `Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly`. Then, deploy the daemonset...

@rmalchow Could you please send us the support bundle so we can look for clues? Please also provide the following information: - the problematic volume name - the rough time slot when the...

@rmalchow Could you also provide the pod log and kubelet log from when the pod is in the CrashLoop state? I'd like to check why the pod is stuck.

After checking the support bundle, I found an r/w timeout in instance-manager-e-96858281 at 2am. It is probably a network issue. ``` 2021-11-25T02:00:54.415641129Z [pvc-f052a410-f19d-4fca-a0ce-936d8f23b6cb-e-aa17071b] time="2021-11-25T02:00:54Z" level=error msg="Setting replica tcp://10.42.7.124:10270 to ERR due to: r/w timeout" 2021-11-25T02:00:54.415693648Z time="2021-11-25T02:00:54Z"...
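When scanning a support bundle for this kind of event, a small script can pull out every replica that was marked ERR. The following is only a minimal sketch: the regular expression is modelled on the single log line quoted above, the function name `find_failed_replicas` is hypothetical, and other Longhorn versions may word the message differently.

```python
import re

# Pattern modelled on the one log line quoted in the comment above.
# Assumption: other instance-manager versions may format this differently.
ERR_PATTERN = re.compile(
    r'time="(?P<time>[^"]+)"\s+level=error\s+'
    r'msg="Setting replica (?P<replica>\S+) to ERR due to: (?P<reason>[^"]+)"'
)


def find_failed_replicas(lines):
    """Return (time, replica address, reason) for each matching log line."""
    hits = []
    for line in lines:
        match = ERR_PATTERN.search(line)
        if match:
            hits.append((match["time"], match["replica"], match["reason"]))
    return hits


# The sample is the log line from the comment, split only for readability.
sample = [
    '2021-11-25T02:00:54.415641129Z '
    '[pvc-f052a410-f19d-4fca-a0ce-936d8f23b6cb-e-aa17071b] '
    'time="2021-11-25T02:00:54Z" level=error '
    'msg="Setting replica tcp://10.42.7.124:10270 to ERR due to: r/w timeout"',
]

for hit in find_failed_replicas(sample):
    print(hit)
```

Feeding it the lines of an instance-manager log file gives a quick list of which replica addresses timed out and when, which can then be compared against node or network events at the same timestamps.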

@rmalchow While you were generating the support bundle, was the rabbitmq pod still in the CrashLoop state? Or had you already restarted it? I'm checking if the volume (pvc-f052a410-f19d-4fca-a0ce-936d8f23b6cb-r-cba0f30b) was changed into...

@rmalchow Well, I'm not really sure whether the "Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly" action was triggered or not. Could you please create a new support bundle if you...

I've checked the log. Both pvc-f052a410-f19d-4fca-a0ce-936d8f23b6cb and pvc-cbbdf292-cdca-47a8-a4e4 had only one disconnected replica, and the other two replicas were still running. The volumes had not crashed. Not sure why ext4...

> hi derek. yes. i think this is really the question - it isn't crashed, and the IO error is about a read-only filesystem: > > ``` > 02:02:14.914 [warning]...