
oncall engine crashes when elasticache is used as redis provider

Open libracoder opened this issue 3 years ago • 1 comments

I am currently working on deploying OnCall with our existing Grafana setup, using ElastiCache as the Redis provider. When I use the Redis included in the chart, everything works fine. When I use ElastiCache, the oncall-engine pods keep crashing.

Here are my logs and screenshots:

_

[uWSGI] getting INI configuration from uwsgi.ini
2022-09-16 09:52:44 *** Starting uWSGI 2.0.20 (64bit) on [Fri Sep 16 09:52:42 2022] ***
2022-09-16 09:52:44 compiled with version: 11.2.1 20220219 on 07 September 2022 11:27:59
2022-09-16 09:52:44 os: Linux-5.4.204-113.362.amzn2.x86_64 #1 SMP Wed Jul 13 21:34:30 UTC 2022
2022-09-16 09:52:44 nodename: release-oncall-engine-7b4679f59f-ggsch
2022-09-16 09:52:44 machine: x86_64
2022-09-16 09:52:44 clock source: unix
2022-09-16 09:52:44 pcre jit disabled
2022-09-16 09:52:44 detected number of CPU cores: 2
2022-09-16 09:52:44 current working directory: /etc/app
2022-09-16 09:52:44 writing pidfile to /tmp/project-master.pid
2022-09-16 09:52:44 detected binary path: /usr/local/bin/uwsgi
2022-09-16 09:52:44 uWSGI running as root, you can use --uid/--gid/--chroot options
2022-09-16 09:52:44 setgid() to 2000
2022-09-16 09:52:44 setuid() to 1000
2022-09-16 09:52:44 chdir() to /etc/app
2022-09-16 09:52:44 your memory page size is 4096 bytes
2022-09-16 09:52:44 detected max file descriptor number: 1048576
2022-09-16 09:52:44 lock engine: pthread robust mutexes
2022-09-16 09:52:44 thunder lock: disabled (you can enable it with --thunder-lock)
2022-09-16 09:52:44 uWSGI http bound on 0.0.0.0:8080 fd 7
2022-09-16 09:52:44 uwsgi socket 0 bound to TCP address 127.0.0.1:44417 (port auto-assigned) fd 6
2022-09-16 09:52:44 Python version: 3.9.13 (main, Aug 10 2022, 00:39:17) [GCC 11.2.1 20220219]
2022-09-16 09:52:44 *** Python threads support is disabled. You can enable it with --enable-threads ***
2022-09-16 09:52:44 Python main interpreter initialized at 0x7fb046c7d2c0
2022-09-16 09:52:44 your server socket listen backlog is limited to 1024 connections
2022-09-16 09:52:44 your mercy for graceful operations on workers is 60 seconds
2022-09-16 09:52:44 mapped 855306 bytes (835 KB) for 5 cores
2022-09-16 09:52:44 *** Operational MODE: preforking ***
2022-09-16 09:52:44 WSGI app 0 (mountpoint='') ready in 2 seconds on interpreter 0x7fb046c7d2c0 pid: 1 (default app)
2022-09-16 09:52:44 *** uWSGI is running in multiple interpreter mode ***
2022-09-16 09:52:44 spawned uWSGI master process (pid: 1)
2022-09-16 09:52:44 spawned uWSGI worker 1 (pid: 7, cores: 1)
2022-09-16 09:52:44 spawned uWSGI worker 2 (pid: 8, cores: 1)
2022-09-16 09:52:44 spawned uWSGI worker 3 (pid: 9, cores: 1)
2022-09-16 09:52:44 spawned uWSGI worker 4 (pid: 10, cores: 1)
2022-09-16 09:52:44 spawned uWSGI worker 5 (pid: 11, cores: 1)
2022-09-16 09:52:44 spawned uWSGI http 1 (pid: 12)
2022-09-16 09:55:09 Fri Sep 16 09:55:09 2022 - SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request /startupprobe/ (ip xxxxx) !!!
2022-09-16 09:55:12 gateway "uWSGI http 1" has been buried (pid: 12)
2022-09-16 09:55:12 ...brutally killing workers...
2022-09-16 09:55:13 worker 1 buried after 1 seconds
2022-09-16 09:55:13 worker 2 buried after 1 seconds
2022-09-16 09:55:13 worker 3 buried after 1 seconds
2022-09-16 09:55:13 worker 4 buried after 1 seconds
2022-09-16 09:55:13 worker 5 buried after 1 seconds
2022-09-16 09:55:13 binary reloading uWSGI...
2022-09-16 09:55:13 chdir() to /etc/app
2022-09-16 09:55:13 closing all non-uwsgi socket fds > 2 (max_fd = 1048576)...
2022-09-16 09:55:13 found fd 6 mapped to socket 0 (127.0.0.1:44417)
2022-09-16 09:55:13 running /usr/local/bin/uwsgi
[uWSGI] getting INI configuration from uwsgi.ini
2022-09-16 09:55:15 *** Starting uWSGI 2.0.20 (64bit) on [Fri Sep 16 09:55:13 2022] ***
2022-09-16 09:55:15 compiled with version: 11.2.1 20220219 on 07 September 2022 11:27:59
2022-09-16 09:55:15 os: Linux-5.4.204-113.362.amzn2.x86_64 #1 SMP Wed Jul 13 21:34:30 UTC 2022
2022-09-16 09:55:15 nodename: release-oncall-engine-7b4679f59f-ggsch
2022-09-16 09:55:15 machine: x86_64
2022-09-16 09:55:15 clock source: unix
2022-09-16 09:55:15 pcre jit disabled
2022-09-16 09:55:15 detected number of CPU cores: 2
2022-09-16 09:55:15 current working directory: /etc/app
2022-09-16 09:55:15 detected binary path: /usr/local/bin/uwsgi
2022-09-16 09:55:15 chdir() to /etc/app
2022-09-16 09:55:15 your memory page size is 4096 bytes
2022-09-16 09:55:15 detected max file descriptor number: 1048576
2022-09-16 09:55:15 lock engine: pthread robust mutexes
2022-09-16 09:55:15 thunder lock: disabled (you can enable it with --thunder-lock)
2022-09-16 09:55:15 uWSGI http bound on 0.0.0.0:8080 fd 9
2022-09-16 09:55:15 uwsgi socket 0 bound to TCP address 127.0.0.1:33459 (port auto-assigned) fd 8
2022-09-16 09:55:15 Python version: 3.9.13 (main, Aug 10 2022, 00:39:17) [GCC 11.2.1 20220219]
2022-09-16 09:55:15 *** Python threads support is disabled. You can enable it with --enable-threads ***
2022-09-16 09:55:15 Python main interpreter initialized at 0x7fa9d74bfab0
2022-09-16 09:55:15 your server socket listen backlog is limited to 1024 connections
2022-09-16 09:55:15 your mercy for graceful operations on workers is 60 seconds
2022-09-16 09:55:15 mapped 855306 bytes (835 KB) for 5 cores
2022-09-16 09:55:15 *** Operational MODE: preforking ***
2022-09-16 09:55:15 WSGI app 0 (mountpoint='') ready in 2 seconds on interpreter 0x7fa9d74bfab0 pid: 1 (default app)
2022-09-16 09:55:15 *** uWSGI is running in multiple interpreter mode ***
2022-09-16 09:55:15 gracefully (RE)spawned uWSGI master process (pid: 1)
2022-09-16 09:55:15 spawned uWSGI worker 1 (pid: 13, cores: 1)
2022-09-16 09:55:15 spawned uWSGI worker 2 (pid: 14, cores: 1)
2022-09-16 09:55:15 spawned uWSGI worker 3 (pid: 15, cores: 1)
2022-09-16 09:55:15 spawned uWSGI worker 4 (pid: 16, cores: 1)
2022-09-16 09:55:15 spawned uWSGI worker 5 (pid: 17, cores: 1)
2022-09-16 09:55:15 spawned uWSGI http 1 (pid: 18)

_

image

libracoder avatar Sep 16 '22 10:09 libracoder

Hey, we ran into a similar issue last week. AWS doesn't like the insecure Redis protocol, and for some reason you can't use the secure protocol. I made a pull request to fix that (#538). It could fix your issue too, I guess, but I'm really not sure.

Greets

Timmypedia avatar Sep 19 '22 07:09 Timmypedia
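
For readers unfamiliar with the distinction mentioned above: Redis clients such as redis-py conventionally use the `redis://` URL scheme for plain connections and `rediss://` (double "s") for TLS connections. The sketch below only illustrates how such a URL might be assembled; the helper name `build_redis_url`, its parameters, and the host names are hypothetical, and it is not taken from PR #538 or from the OnCall codebase.

```python
from typing import Optional
from urllib.parse import quote


def build_redis_url(host: str, port: int = 6379, db: int = 0,
                    password: Optional[str] = None,
                    use_tls: bool = False) -> str:
    """Hypothetical helper: assemble a Redis connection URL.

    "rediss" selects TLS, "redis" selects the unencrypted protocol.
    """
    scheme = "rediss" if use_tls else "redis"
    # Percent-encode the password so characters like "/" or "@" stay valid in a URL.
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"{scheme}://{auth}{host}:{port}/{db}"


if __name__ == "__main__":
    # Placeholder host names, not real endpoints.
    print(build_redis_url("redis.example.internal"))
    print(build_redis_url("my-cluster.example.cache.amazonaws.com",
                          password="s3cr/et", use_tls=True))
```

Whether TLS is required depends on how the ElastiCache cluster is configured (for example, whether in-transit encryption is enabled), so treat the `use_tls` flag as something to match against your own setup.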

Thanks for the feedback. I'll wait for your PR to get merged.

libracoder avatar Sep 23 '22 06:09 libracoder

Closing this issue as #538 has been merged 😄

joeyorlando avatar Sep 27 '22 15:09 joeyorlando