Jicofo does not recover after oom-kill
Description
I used the stress tool to provoke an out-of-memory (OOM) condition. JVB recovered fine (it was restarted automatically), but Jicofo did not. I'm not sure whether this is a bug or a feature request, but it would be good for Jicofo to recover after an OOM kill.
Current behavior
ubuntu@jitsi:~$ service jicofo status
× jicofo.service - LSB: Jitsi conference Focus
Loaded: loaded (/etc/init.d/jicofo; generated)
Active: failed (Result: oom-kill) since Sat 2024-03-30 01:31:29 UTC; 9min ago
Docs: man:systemd-sysv-generator(8)
Process: 354 ExecStart=/etc/init.d/jicofo start (code=exited, status=0/SUCCESS)
Process: 6441 ExecStop=/etc/init.d/jicofo stop (code=exited, status=0/SUCCESS)
CPU: 1min 6.406s
Mar 29 16:22:26 jitsi.xxx.nu systemd[1]: Starting LSB: Jitsi conference Focus...
Mar 29 16:22:27 jitsi.xxx.nu jicofo[354]: Starting jicofo: jicofo started.
Mar 29 16:22:27 jitsi.xxx.nu systemd[1]: Started LSB: Jitsi conference Focus.
Mar 30 01:31:27 jitsi.xxx.nu systemd[1]: jicofo.service: A process of this unit has been killed by the OOM killer.
Mar 30 01:31:29 jitsi.xxx.nu systemd[1]: jicofo.service: Failed with result 'oom-kill'.
Mar 30 01:31:30 jitsi.xxx.nu jicofo[6441]: Stopping jicofo: /etc/init.d/jicofo: 46: kill: No such process
Mar 30 01:31:30 jitsi.xxx.nu jicofo[6441]: jicofo stopped.
Mar 30 01:31:29 jitsi.xxx.nu systemd[1]: jicofo.service: Consumed 1min 6.406s CPU time.
Expected Behavior
Jicofo should recover from OOM.
Possible Solution
I posted in the community and got this reply from emrah:
Looks like jitsi-videobridge has a systemd unit with restart on-failure (/lib/systemd/system/jitsi-videobridge2.service) but jicofo hasn’t… IIUC, jicofo still uses old-style initd script.
https://community.jitsi.org/t/optimize-server-for-perfomance-vs-quality-to-run-on-cheap-hardware/130843/6?u=maxf
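I have not tested this, but based on that reply, a restart policy for jicofo might look roughly like the systemd drop-in below. This is only a sketch under assumptions: the drop-in path and file name are my own choice, the RestartSec value is arbitrary, and I am assuming that a drop-in applies to the unit generated from /etc/init.d/jicofo and that Restart=on-failure covers the 'oom-kill' result shown above.

```ini
# Hypothetical drop-in file: /etc/systemd/system/jicofo.service.d/override.conf
# (path and file name are assumptions, not something shipped by the jicofo package)
[Service]
# Restart the service whenever the unit ends up in a failed state.
# Assumption: this includes "Failed with result 'oom-kill'".
Restart=on-failure
# Wait a few seconds before restarting; the value is arbitrary.
RestartSec=5
```

After creating such a file, systemctl daemon-reload would be needed for systemd to pick it up. A proper fix would probably be a native unit file for jicofo, like the one jitsi-videobridge2 already has.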
Steps to reproduce
Use stress to provoke an OOM kill of the Jicofo process, then check the output of service jicofo status.
Environment details
Ubuntu 22, Jitsi-meet 2.0.9364
Sorry, I'm only seeing this now. I agree it should be restarted on OOM. Contributions are welcome.
Thanks for the reply. Unfortunately, in this case I would not know how to fix it (i.e., contribute).
That's OK, let's keep this open. It's probably trivial, but I don't have the time to look into it right now.