ejabberd
ejabberd copied to clipboard
ejabberd do not start after update from 23.10 to 24.02
Hi all :-) A problem similar to this : https://github.com/processone/ejabberd/issues/3150 That unfortunately do not come resolved. (the memsup was different from my epam). If I understood properly the database come updated itself. I'm using the included database. After the mentioned update ejabberd do not start anymore with the following errors. I don know what to do... really... Please ask me all the necessary info (and how to gather them) for investigate this issue. (I tried to entirely remove the firewall but the problem persist) The server were working fine , I launched the new install (24.02) , rebooted , the issue. I'm an Arch user... Thanks at any good soul will find the time to support me ;-)
2024-04-13 13:47:50.742784+02:00 [info] <0.190.0>@ejabberd_config:load/1:82 Loading configuration from /opt/ejabberd/conf/ejabberd.yml
2024-04-13 13:47:51.785172+02:00 [info] <0.190.0>@ejabberd_config:load/1:89 Configuration loaded successfully
2024-04-13 13:52:49.244505+02:00 [notice] <0.70.0> SIGTERM received - shutting down
2024-04-13 14:01:26.057330+02:00 [error] <0.214.0> External eimp process (pid=3826) has terminated unexpectedly, restarting in a few seconds
2024-04-13 14:01:26.058274+02:00 [error] <0.216.0> External eimp process (pid=3827) has terminated unexpectedly, restarting in a few seconds
2024-04-13 14:01:26.058452+02:00 [error] <0.212.0> External eimp process (pid=3825) has terminated unexpectedly, restarting in a few seconds
2024-04-13 14:01:26.058905+02:00 [error] <0.210.0> External eimp process (pid=3824) has terminated unexpectedly, restarting in a few seconds
2024-04-13 14:01:26.064626+02:00 [error] <0.222.0>@gen_server:error_info/8:1391 ** Generic server epam terminating
** Last message in was {#Port<0.10>,{exit_status,143}}
** When Server state == {state,#Port<0.10>}
** Reason for termination ==
** port_died
2024-04-13 14:01:26.065731+02:00 [error] <0.222.0>@proc_lib:crash_report/4:584 CRASH REPORT:
crasher:
initial call: epam:init/1
pid: <0.222.0>
registered_name: epam
exception exit: port_died
in function gen_server:handle_common_reply/8 (gen_server.erl, line 1226)
ancestors: [epam_sup,<0.220.0>]
message_queue_len: 0
messages: []
links: [<0.221.0>]
dictionary: []
trap_exit: false
status: running
heap_size: 4185
stack_size: 28
reductions: 7829
neighbours:
2024-04-13 14:01:26.080804+02:00 [error] <0.221.0>@supervisor:do_restart/3:744 SUPERVISOR REPORT:
supervisor: {local,epam_sup}
errorContext: child_terminated
reason: port_died
offender: [{pid,<0.222.0>},
{id,epam},
{mfargs,{epam,start_link,[]}},
{restart_type,permanent},
{significant,false},
{shutdown,brutal_kill},
{child_type,worker}]
This is the status of the service. Seems that it stay pending the connection. I don't know how to investigate that "epam" that may cause this stuck. And do not seems related with other "epam port died" I found on the net.
[arch-nas@arch-nas ~]$ sudo systemctl status ejabberd
● ejabberd.service - XMPP Server
Loaded: loaded (/etc/systemd/system/ejabberd.service; enabled; preset: disabled)
Active: activating (start) since Sat 2024-04-13 20:48:18 CEST; 4min 24s ago
Main PID: 2353 (ejabberdctl)
Tasks: 34 (limit: 9378)
Memory: 739.5M (peak: 907.0M)
CPU: 5min 2.304s
CGroup: /system.slice/ejabberd.service
├─2353 /bin/sh /opt/ejabberd-24.02/bin/ejabberdctl foreground
├─2366 /opt/ejabberd-24.02/erts-14.2.2/bin/beam.smp -K true -P 250000 -- -root /opt/ejabberd-24.02 -bindir /opt/ejabberd-24.02/erts-14.2.2/bin -progname erl -- -h>
├─2378 /opt/ejabberd-24.02/erts-14.2.2/bin/epmd -daemon
├─2382 erl_child_setup 65536
├─2407 /opt/ejabberd-24.02/lib/eimp-1.0.22/priv/bin/eimp
├─2408 /opt/ejabberd-24.02/lib/eimp-1.0.22/priv/bin/eimp
├─2409 /opt/ejabberd-24.02/lib/eimp-1.0.22/priv/bin/eimp
├─2410 /opt/ejabberd-24.02/lib/eimp-1.0.22/priv/bin/eimp
├─2411 /opt/ejabberd-24.02/lib/epam-1.0.14/priv/bin/epam
├─2412 /opt/ejabberd-24.02/erts-14.2.2/bin/inet_gethost 4
└─2413 /opt/ejabberd-24.02/erts-14.2.2/bin/inet_gethost 4
Apr 13 20:48:21 arch-nas ejabberdctl[2366]: {child_type,supervisor}]
Apr 13 20:48:21 arch-nas ejabberdctl[2366]: 2024-04-13 20:48:21.216010+02:00 [debug] PROGRESS REPORT:
Apr 13 20:48:21 arch-nas ejabberdctl[2366]: supervisor: {local,kernel_safe_sup}
Apr 13 20:48:21 arch-nas ejabberdctl[2366]: started: [{pid,<0.404.0>},
Apr 13 20:48:21 arch-nas ejabberdctl[2366]: {id,disk_log_server},
Apr 13 20:48:21 arch-nas ejabberdctl[2366]: {mfargs,{disk_log_server,start_link,[]}},
Apr 13 20:48:21 arch-nas ejabberdctl[2366]: {restart_type,permanent},
Apr 13 20:48:21 arch-nas ejabberdctl[2366]: {significant,false},
Apr 13 20:48:21 arch-nas ejabberdctl[2366]: {shutdown,2000},
Apr 13 20:48:21 arch-nas ejabberdctl[2366]: {child_type,worker}]
At "sudo journalctl -xeu ejabberd.service" this is the output:
[arch-nas@arch-nas ~]$ sudo journalctl -xeu ejabberd.service
[sudo] password for arch-nas:
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {significant,false},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {shutdown,3000},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {child_type,worker}]
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: 2024-04-13 20:58:34.647473+02:00 [debug] PROGRESS REPORT:
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: supervisor: {local,mnesia_kernel_sup}
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: started: [{pid,<0.396.0>},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {id,mnesia_subscr},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {mfargs,{mnesia_subscr,start,[]}},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {restart_type,permanent},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {significant,false},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {shutdown,3000},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {child_type,worker}]
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: 2024-04-13 20:58:34.661363+02:00 [debug] PROGRESS REPORT:
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: supervisor: {local,mnesia_kernel_sup}
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: started: [{pid,<0.397.0>},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {id,mnesia_locker},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {mfargs,{mnesia_locker,start,[]}},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {restart_type,permanent},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {significant,false},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {shutdown,3000},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {child_type,worker}]
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: 2024-04-13 20:58:34.670528+02:00 [debug] PROGRESS REPORT:
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: supervisor: {local,mnesia_kernel_sup}
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: started: [{pid,<0.398.0>},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {id,mnesia_recover},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {mfargs,{mnesia_recover,start,[]}},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {restart_type,permanent},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {significant,false},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {shutdown,180000},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {child_type,worker}]
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: 2024-04-13 20:58:34.752914+02:00 [debug] PROGRESS REPORT:
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: supervisor: {local,kernel_safe_sup}
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: started: [{pid,<0.403.0>},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {id,disk_log_sup},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {mfargs,{disk_log_sup,start_link,[]}},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {restart_type,permanent},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {significant,false},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {shutdown,1000},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {child_type,supervisor}]
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: 2024-04-13 20:58:34.753550+02:00 [debug] PROGRESS REPORT:
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: supervisor: {local,kernel_safe_sup}
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: started: [{pid,<0.404.0>},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {id,disk_log_server},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {mfargs,{disk_log_server,start_link,[]}},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {restart_type,permanent},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {significant,false},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {shutdown,2000},
Apr 13 20:58:34 arch-nas ejabberdctl[2870]: {child_type,worker}]
I tried to substitute the "/opt/ejabberd-24.02/lib/epam-1.0.14/priv/bin/epam" with the old one 1.0.12 from the 23.10 installation but of course it do not worked ...
As you use the internal (mnesia) database, I guess you made a backup of its files before updating the program.
What happens if you uninstall ejabberd 24.02, recover back the old mnesia directory, install ejabberd 23.10 and start it: does it start correctly?
Hi Lop ! Thanks for the patience to read all this :-) I try to explain as I do operate with my hardware. Maybe is useful... I have an Asustor NAS in which I installed Arch (instead of using the stock software) I have two MX500 1TB (and other two disk but not relevant in this case). The main disk is just one , the second is passive. (not auto cloning software or similar stuff) . From time to time I update the system , reboot the machine to check everything is working and then update the ejabberd using the .run file. I install it as root and it place all the files in opt. If the update of jabber end fine and it work without errors of any kind I power off the machine , using a USB installation pen of Arch I clone with dd the entire first disk to the second disk. Like this I have a working backup but not subjected at those kind of errors. And right now I'm running this backup with the 23.10 installed ! I'm using this backup made nearly one month ago. On the first disk I tried to install and configure prosody and it after some hours of learning curve (i'm a noob) ended working fine but of course with its limitation for some defective modules.
I tried as you suggested and the errors became like this:
2024-04-17 22:46:15.105809+02:00 [error] <0.457.0>@proc_lib:crash_report/4:584 CRASH REPORT:
crasher:
initial call: ejabberd_commands:init/1
pid: <0.457.0>
registered_name: []
exception exit: {aborted,{node_not_running,ejabberd@localhost}}
in function mnesia:abort/1 (mnesia.erl, line 360)
in call from ejabberd_commands:init/1 (src/ejabberd_commands.erl, line 111)
in call from gen_server:init_it/2 (gen_server.erl, line 962)
in call from gen_server:init_it/6 (gen_server.erl, line 917)
ancestors: [ejabberd_sup,<0.179.0>]
message_queue_len: 0
messages: []
links: [<0.384.0>]
dictionary: []
trap_exit: false
status: running
heap_size: 610
stack_size: 28
reductions: 206
neighbours:
2024-04-17 22:46:15.107700+02:00 [error] <0.384.0>@supervisor:start_children/2:398 SUPERVISOR REPORT:
supervisor: {local,ejabberd_sup}
errorContext: start_error
reason: {aborted,{node_not_running,ejabberd@localhost}}
offender: [{pid,undefined},
{id,ejabberd_commands},
{mfargs,{ejabberd_commands,start_link,[]}},
{restart_type,permanent},
{significant,false},
{shutdown,60000},
{child_type,worker}]
2024-04-17 22:46:15.120800+02:00 [debug] <0.385.0>@ejabberd_systemd:terminate/2:133 Terminating ejabberd_systemd (shutdown)
2024-04-17 22:46:15.121057+02:00 [debug] <0.385.0>@ejabberd_systemd:cancel_timer/1:177 Cancelling watchdog timer
2024-04-17 22:46:15.121569+02:00 [critical] <0.179.0>@ejabberd_app:start/2:68 Failed to start ejabberd application: {error,
{shutdown,
{failed_to_start_child,
ejabberd_commands,
{aborted,
{node_not_running,
ejabberd@localhost}}}}}
I also performed an entire OPT substitution but it still fail to start. Boh , I don't know if something is gone corrupted during the last update and have damaged some libraries . Everything appeared flawless and the system appear to work fine. (I switched to the first disk for all the tests and I'm writing from it) . I tried also to reinstall all the apps and libraries mentioned in the dependencies of ejabberd (as if I would install it from source). I tried to install it also from source and it fail again. I fear to touch this disk that work. But , I can operate as follow: Use dd to backup again the old working disk. Despite the regression I'll have two working disk and then operate again the update of everything. But this time with the option to bump a copy of the database. (to a mysql extension for example). I'll proceed like this and I will report the result :-)
Hey. I know this may have come in late, but I experienced a somewhat similar issue with ejabberd v24
. The systemctl
service would hang indefinitely when I started it. There was no helpful message inside /var/log/ejabberd/ejabberd.log
.
I resorted to downgrading the package back to 23.10-1
. I was met with an error about a dependency, erlang-unixodbc
when I tried to install the older version. I was able to get it working by doing the following:
- Uninstalling the latest version of
ejabberd
. - Installing the most recent (before running
pacman -Syu
)erlang-unixodbc
-v26.2.3-1
from thepacman
cache folder. - installing the most recent version of (before running
pacman -Syu
)ejabberd
-v23.10-1
from thepacman
cache folder.
I hope this helps in some way. Here are a few links that helped me solve the problem:
Should we close this ? This does not seems to be an ejabberd / Erlang issue, but an issue with the build env ?
I added commit https://github.com/processone/ejabberd/commit/7c63cd1000f4f0b8c507baab34fb0d38512ad16e that should at least report error message when one of our apps that we have listed as our dependency (as i guess happens here with odbc) prevents ejabberd from starting, so at least we shouldn't not abort silently when this happens.