oncall
Problem with rabbitmq in helm chart
What went wrong?
What happened: I deployed OnCall using the official Helm chart with the bundled RabbitMQ. It worked well for a short time, but then the RabbitMQ pod started to crash. I found that the crashes are caused by a failing liveness probe, which in turn fails because of invalid credentials. The following log lines appear in the RabbitMQ pod:
2023-09-27 16:37:13.055575+00:00 [warning] <0.786.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:37:43.045473+00:00 [warning] <0.791.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:38:03.740897+00:00 [warning] <0.796.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:38:13.044706+00:00 [warning] <0.798.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:38:43.041901+00:00 [warning] <0.810.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:39:07.745523+00:00 [warning] <0.820.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:39:13.048147+00:00 [warning] <0.822.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:39:13.095387+00:00 [warning] <0.824.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:39:43.050766+00:00 [warning] <0.829.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:39:43.099972+00:00 [warning] <0.831.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:40:13.047336+00:00 [warning] <0.837.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:40:13.096605+00:00 [warning] <0.839.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:40:35.735943+00:00 [warning] <0.844.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:40:43.049685+00:00 [warning] <0.846.0> HTTP access denied: user 'user' - invalid credentials
2023-09-27 16:40:43.106209+00:00 [warning] <0.848.0> HTTP access denied: user 'user' - invalid credentials
After further investigation I found that the password in the RabbitMQ secret changes for an unknown reason. This can also be verified by trying to log in to the RabbitMQ web panel. The only workaround for now is to delete the PVC so that RabbitMQ reinitializes its storage and the password in the environment variable matches the one in storage again. This only works until the next reset of the RabbitMQ secret. I'm not sure whether this is a bug in this chart or in the upstream one, but I didn't notice any similar bug reports in the Bitnami repository (the last roughly similar bug was fixed in 2021).
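The mismatch described above can be confirmed without opening the web panel. The following is a minimal Python sketch, not part of the chart: it reads the password from the Kubernetes secret and asks the RabbitMQ management API whether it is accepted. The secret name, namespace, and the `rabbitmq-password` key are assumptions and should be adjusted to the actual release; it also assumes the management port is reachable locally, for example through `kubectl port-forward` to port 15672.

```python
import base64
import json
import subprocess
import urllib.error
import urllib.request


def decode_secret_value(b64_value: str) -> str:
    """Kubernetes stores Secret data base64-encoded; return the plain text."""
    return base64.b64decode(b64_value).decode("utf-8")


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def read_secret_password(secret: str, namespace: str, key: str) -> str:
    """Fetch one field of a Secret via kubectl and decode it."""
    out = subprocess.run(
        ["kubectl", "get", "secret", secret, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return decode_secret_value(json.loads(out)["data"][key])


def broker_accepts(base_url: str, user: str, password: str) -> bool:
    """Return True if the management API accepts the credentials."""
    req = urllib.request.Request(
        f"{base_url}/api/whoami",
        headers={"Authorization": basic_auth_header(user, password)},
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 401:  # credentials rejected by the broker
            return False
        raise


# Example usage (all names below are hypothetical placeholders):
#   pw = read_secret_password("oncall-rabbitmq", "oncall", "rabbitmq-password")
#   print(broker_accepts("http://127.0.0.1:15672", "user", pw))
```

If `broker_accepts` returns `False` for the password currently stored in the secret, the secret and the broker's persisted state have diverged, which matches the `invalid credentials` warnings in the log.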
What did you expect to happen: RabbitMQ runs stably and the password in the secret does not change.
How do we reproduce it?
- Deploy Oncall using this chart with the bundled RabbitMQ enabled
- Wait until the secret is reset, so that RabbitMQ stops accepting the password stored in it
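Because the reset happens at an unknown time, it may help to record when the secret value actually changes, so the timestamp can be correlated with other events in the cluster (for example a `helm upgrade` or a GitOps sync; whether either is involved here is not confirmed). Below is a small Python sketch that polls the secret with `kubectl` and prints a line whenever its content changes. The secret name, namespace, key, and polling interval are hypothetical placeholders.

```python
import hashlib
import json
import subprocess
import time
from datetime import datetime, timezone
from typing import Optional


def fingerprint(b64_value: str) -> str:
    """Short hash of the encoded value, so the password itself is never printed."""
    return hashlib.sha256(b64_value.encode("ascii")).hexdigest()[:12]


def changed(previous: Optional[str], current: str) -> bool:
    """True only when a previously seen fingerprint differs from the current one."""
    return previous is not None and previous != current


def fetch_secret_field(secret: str, namespace: str, key: str) -> str:
    """Return the still base64-encoded field of a Secret via kubectl."""
    out = subprocess.run(
        ["kubectl", "get", "secret", secret, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)["data"][key]


def watch(secret: str, namespace: str, key: str, interval_s: int = 60) -> None:
    """Poll forever and report every change of the secret field."""
    previous = None
    while True:
        current = fingerprint(fetch_secret_field(secret, namespace, key))
        if changed(previous, current):
            now = datetime.now(timezone.utc).isoformat()
            print(f"{now} secret changed: {previous} -> {current}")
        previous = current
        time.sleep(interval_s)


# Example usage (hypothetical names, adjust to the actual release):
#   watch("oncall-rabbitmq", "oncall", "rabbitmq-password", interval_s=60)
```

Only a truncated hash is logged, so the output can be attached to a bug report without leaking the password.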
Grafana OnCall Version
v1.3.38
Product Area
Helm
Grafana OnCall Platform?
None
User's Browser?
No response
Anything else to add?
No response