scylla-operator
scylla complex interactions between supervisord and k8s container liveness
Describe the bug
Scylla runs under supervisord. Supervisord's default configuration is autorestart=unexpected, which means supervisord will attempt to restart a process that exits with an unexpected exit code (exitcodes=0 by default, so any non-zero exit is unexpected). If the scylla process fails due to an exception, supervisord will immediately restart it.
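For reference, the defaults described above correspond to a supervisord program section roughly like this (a sketch; the program name and command are illustrative, the option values are supervisord's documented defaults):

```ini
[program:scylla]
command=/usr/bin/scylla        ; illustrative command line
autorestart=unexpected         ; restart only on "unexpected" exits (the default)
exitcodes=0                    ; only exit code 0 is "expected" (the default)
```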
Normally this is fine, but I think it's operationally confusing. Should scylla fail with a non-zero exit, I would expect the container to fail its liveness check and then be restarted by kubernetes, incrementing the pod's restart counter. With the current configuration the failure still happens, but only scylla logs/metrics would indicate that this is the case. This can be observed by running kill -9 against the scylla process in a running container.
If a scylla replica has something terrible happen to its internals (e.g. storage corruption) and the scylla process fails during its init sequence, scylla will crash loop: init will throw an exception, scylla will stop, and supervisord will immediately restart it. The container will remain running while scylla is stuck in a start/stop crash loop under supervisord. You can observe this by deleting /var/lib/scylla on an active replica and then issuing kill -9 to the scylla process.
It's not clear to me what the best way to handle this is. Maybe it should just be accepted behavior? Naively, I would expect supervisord NOT to restart scylla and to rely on kubernetes liveness to restart the container, rather than restarting the scylla process inside the container with supervisord. However, perhaps there's a valid operational reason to leave this the way it is?
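One way to get the behavior I'd naively expect would be to disable supervisord's restart for the scylla program entirely (a sketch, not tested against the actual image):

```ini
[program:scylla]
autorestart=false    ; never restart; leave a dead scylla to k8s liveness
```

With this, a crashed scylla stays down, the container's liveness check fails, and kubernetes restarts the container, incrementing the restart counter as expected.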
Expected behavior
When scylla fails, it should be restarted by k8s liveness, not supervisord.
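For example, a liveness probe on the CQL port would fail once scylla stays down (a sketch; port 9042 is the standard CQL native transport port, the timing values are illustrative — and note that with supervisord restarting scylla immediately, such a probe may never observe the failure):

```yaml
livenessProbe:
  tcpSocket:
    port: 9042        # CQL native transport port
  periodSeconds: 10
  failureThreshold: 3 # restart the container after ~30s of scylla being unreachable
```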
Environment:
- Platform: any
- Kubernetes version: any
- Scylla version: 4.4/4.5
- Scylla-operator version: any
I think a lot of it is just historical, because the same scylla container was/is used without orchestration.
When scylla fails it should be restarted by k8s liveness, not supervisord. If we force supervisord to actually exit, I think the operator "sidecar" that spawns scylla should exit as well, and the container would restart. (No need for probes.)
This caught me last week: the scylladb instance for scylla-manager was hitting its memory limit, getting OOM-killed, and restarting. It came back up quickly enough that my monitoring didn't alert on it being down (and since the pod didn't restart, my pod-restart monitoring didn't catch it either).