Starting et.service fails with a timeout
Hi, I'm trying to get et up and running on my desktop. I get a timeout when trying to start et.service.
Steps to reproduce:
-
I'm running Debian. Built ET from source at 28f77717cc08270c1aaa605c23eaabdc1d48cff0. I ran
make testand it passed, so naturally everything ought to work perfectly. -
I modified
/etc/systemd/system/et.serviceto pass -v 9
ExecStart=/usr/local/bin/etserver --cfgfile=/etc/et.cfg -v 9
# Note -v 9 even though the bug template says --v=9. --v=9 doesn't work, shows the command's help text.
- I tried to start the et service and got a timeout.
❯ sudo systemctl enable --now et.service
Job for et.service failed because a timeout was exceeded.
See "systemctl status et.service" and "journalctl -xe" for details.
Debug info below. Any ideas what might be going on?
❯ systemctl status et.service
● et.service - Eternal Terminal
Loaded: loaded (/etc/systemd/system/et.service; enabled; vendor preset: enabled)
Active: failed (Result: timeout) since Mon 2020-07-13 15:55:24 PDT; 17s ago
Process: 81935 ExecStart=/usr/local/bin/etserver --cfgfile=/etc/et.cfg -v 9 (code=killed, signal=TERM)
Jul 13 15:53:54 jlebar systemd[1]: Starting Eternal Terminal...
Jul 13 15:55:24 jlebar systemd[1]: et.service: start operation timed out. Terminating.
Jul 13 15:55:24 jlebar systemd[1]: et.service: Failed with result 'timeout'.
Jul 13 15:55:24 jlebar systemd[1]: Failed to start Eternal Terminal.
$ journalctl -xe
Jul 13 15:53:54 jlebar systemd[1]: Starting Eternal Terminal...
-- Subject: A start job for unit et.service has begun execution
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit et.service has begun execution.
--
-- The job identifier is 8285.
Jul 13 15:53:56 jlebar crudd[10583]: ToolLogProto logjam_tag=tattler_initgoogle log_path: "//corp/platforms/hostidentity/cli:hostidentity"
log_timestamp: 1594680836 host_name: "jlebar" uid: 0 language: "go" argv: "6981ff396c9e0fd860bc1f75024d37a9" argv: "--syncmissingpin=true"
argv: "fdff5850cc4c9116e95515e022b9e478" build_timestamp: 1592244617 version: "316475530" logger: "crudd" os: LINUX_OS
Jul 13 15:53:56 jlebar nsscacheclient[81951]: INFO : 2020/07/13 15:53:56.941957 main.go:79: Loading GECC credential.
Jul 13 15:53:56 jlebar nsscacheclient[81951]: INFO : 2020/07/13 15:53:56.971633 client_netgroupsyncer.go:87: netgroup: start syncing from "
da94b4ed4b47baaa57c55e78ae45a0a72f54f511"
Jul 13 15:53:57 jlebar nsscacheclient[81951]: INFO : 2020/07/13 15:53:57.174638 client_netgroupsyncer.go:131: netgroup: version "da94b4ed4b
47baaa57c55e78ae45a0a72f54f511" is latest.
Jul 13 15:53:57 jlebar nsscacheclient[81951]: INFO : 2020/07/13 15:53:57.175169 main.go:225: SUCCESS: Completed run (2020.07.06.RC00 start:
1594680836.939 rtime:0.235 utime:0.024 stime:0.012 maxrss:29032k dials:1 sent:3539 rcvd:4778 syncer:netgroup).
Jul 13 15:54:01 jlebar CRON[81973]: pam_unix(cron:session): session opened for user root by (uid=0)
Jul 13 15:54:01 jlebar audit[81973]: USER_START pid=81973 uid=0 auid=0 ses=105 subj==unconfined msg='op=PAM:session_open grantors=pam_env,p
am_permit,pam_limits,pam_unix,pam_limits,pam_loginuid acct="root" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
Jul 13 15:54:01 jlebar audispd[870]: node=jlebar type=USER_START msg=audit(1594680841.134:2899): pid=81973 uid=0 auid=0
ses=105 subj==unconfined msg='op=PAM:session_open grantors=pam_env,pam_permit,pam_limits,pam_unix,pam_limits,pam_loginuid acct="root" exe="/usr/sbin/cron" hos
tname=? addr=? terminal=cron res=success'
Jul 13 15:54:01 jlebar CRON[81974]: (root) CMD (/usr/bin/corp_cronwrap -t 120 /usr/sbin/corpdns_ddns_client)
Jul 13 15:54:01 jlebar /usr/bin/lock[81985]: called by /usr/bin/bash for . uid 0, euid 0.
Jul 13 15:54:13 jlebar collectd[2013]: plugin_dispatch_values: Dataset not found: io_ops (from "jlebar/processes-murdockd/io_ops"), check your types.db!
Jul 13 15:54:13 jlebar collectd[2013]: plugin_dispatch_values: Dataset not found: io_ops (from "jlebar/processes-collectd/io_ops"), check your types.db!
Jul 13 15:54:15 jlebar CRON[81973]: pam_unix(cron:session): session closed for user root
Jul 13 15:54:48 jlebar crudd[10583]: ToolLogProto logjam_tag=tattler_initgoogle log_path: "//corp/platforms/hostidentity/cli:hostidentity" log_timestamp: 1594680888 host_name: "jlebar" uid: 0 language: "go" argv: "6981ff396c9e0fd860bc1f75024d37a9" argv: "--syncmissingpin=true" argv: "fdff5850cc4c9116e95515e022b9e478" build_timestamp: 1592244617 version: "316475530" logger: "crudd" os: LINUX_OS
Jul 13 15:54:48 jlebar nsscacheclient[82047]: INFO : 2020/07/13 15:54:48.827839 main.go:79: Loading GECC credential.
Jul 13 15:54:48 jlebar nsscacheclient[82047]: INFO : 2020/07/13 15:54:48.857159 client_netgroupsyncer.go:87: netgroup: start syncing from "da94b4ed4b47baaa57c55e78ae45a0a72f54f511"
Jul 13 15:54:49 jlebar nsscacheclient[82047]: INFO : 2020/07/13 15:54:49.024729 client_netgroupsyncer.go:131: netgroup: version "da94b4ed4b47baaa57c55e78ae45a0a72f54f511" is latest.
Jul 13 15:54:49 jlebar nsscacheclient[82047]: INFO : 2020/07/13 15:54:49.025258 main.go:225: SUCCESS: Completed run (2020.07.06.RC00 start:1594680888.822 rtime:0.202 utime:0.017 stime:0.025 maxrss:29032k dials:1 sent:3539 rcvd:4778 syncer:netgroup).
Jul 13 15:55:01 jlebar CRON[82078]: pam_unix(cron:session): session opened for user root by (uid=0)
Jul 13 15:55:01 jlebar audit[82078]: USER_START pid=82078 uid=0 auid=0 ses=106 subj==unconfined msg='op=PAM:session_open grantors=pam_env,pam_permit,pam_limits,pam_unix,pam_limits,pam_loginuid acct="root" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
Jul 13 15:55:01 jlebar audispd[870]: node=jlebar type=USER_START msg=audit(1594680901.434:2900): pid=82078 uid=0 auid=0 ses=106 subj==unconfined msg='op=PAM:session_open grantors=pam_env,pam_permit,pam_limits,pam_unix,pam_limits,pam_loginuid acct="root" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
Jul 13 15:55:01 jlebar CRON[82079]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jul 13 15:55:01 jlebar CRON[82078]: pam_unix(cron:session): session closed for user root
Jul 13 15:54:48 jlebar crudd[10583]: ToolLogProto logjam_tag=tattler_initgoogle log_path: "//corp/platforms/hostidentity/cli:hostidentity" log_timestamp: 1594680888 host_name: "jlebar" uid: 0 language: "go" argv: "6981ff396c9e0fd860bc1f75024d37a9" argv: "--syncmissingpin=true" argv: "fdff5850cc4c9116e95515e022b9e478" build_timestamp: 1592244617 version: "316475530" logger: "crudd" os: LINUX_OS
Jul 13 15:54:48 jlebar nsscacheclient[82047]: INFO : 2020/07/13 15:54:48.827839 main.go:79: Loading GECC credential.
Jul 13 15:54:48 jlebar nsscacheclient[82047]: INFO : 2020/07/13 15:54:48.857159 client_netgroupsyncer.go:87: netgroup: start syncing from "da94b4ed4b47baaa57c55e78ae45a0a72f54f511"
Jul 13 15:54:49 jlebar nsscacheclient[82047]: INFO : 2020/07/13 15:54:49.024729 client_netgroupsyncer.go:131: netgroup: version "da94b4ed4b47baaa57c55e78ae45a0a72f54f511" is latest.
Jul 13 15:54:49 jlebar nsscacheclient[82047]: INFO : 2020/07/13 15:54:49.025258 main.go:225: SUCCESS: Completed run (2020.07.06.RC00 start:1594680888.822 rtime:0.202 utime:0.017 stime:0.025 maxrss:29032k dials:1 sent:3539 rcvd:4778 syncer:netgroup).
Jul 13 15:55:01 jlebar CRON[82078]: pam_unix(cron:session): session opened for user root by (uid=0)
Jul 13 15:55:01 jlebar audit[82078]: USER_START pid=82078 uid=0 auid=0 ses=106 subj==unconfined msg='op=PAM:session_open grantors=pam_env,pam_permit,pam_limits,pam_unix,pam_limits,pam_loginuid acct="root" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
Jul 13 15:55:01 jlebar audispd[870]: node=jlebar type=USER_START msg=audit(1594680901.434:2900): pid=82078 uid=0 auid=0 ses=106 subj==unconfined msg='op=PAM:session_open grantors=pam_env,pam_permit,pam_limits,pam_unix,pam_limits,pam_loginuid acct="root" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
Jul 13 15:55:01 jlebar CRON[82079]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jul 13 15:55:01 jlebar CRON[82078]: pam_unix(cron:session): session closed for user root
Jul 13 15:55:13 jlebar collectd[2013]: plugin_dispatch_values: Dataset not found: io_ops (from "jlebar/processes-murdockd/io_ops"), check your types.db!
Jul 13 15:55:13 jlebar collectd[2013]: plugin_dispatch_values: Dataset not found: io_ops (from "jlebar/processes-collectd/io_ops"), check your types.db!
Jul 13 15:55:24 jlebar systemd[1]: et.service: start operation timed out. Terminating.
Jul 13 15:55:24 jlebar systemd[1]: et.service: Failed with result 'timeout'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit et.service has entered the 'failed' state with result 'timeout'.
Jul 13 15:55:24 jlebar systemd[1]: Failed to start Eternal Terminal.
-- Subject: A start job for unit et.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit et.service has finished with a failure.
--
-- The job identifier is 8285 and the job result is failed.
Jul 13 15:55:24 jlebar audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=et comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Jul 13 15:55:24 jlebar audispd[870]: node=jlebar type=SERVICE_START msg=audit(1594680924.241:2901): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=et comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Jul 13 15:55:24 jlebar sudo[81911]: pam_unix(sudo:session): session closed for user root
Jul 13 15:55:42 jlebar systemd[1]: Configuration file /etc/systemd/system/et.service is marked world-inaccessible. This has no effect as configuration data is accessible via APIs without restrictions. Proceeding anyway.
/tmp/etserver* does not exist, so nothing to copy there.
What happens when you run etserver manually (as root)?
● et.service - Eternal Terminal
Loaded: loaded (/etc/systemd/system/et.service; enabled; vendor preset: disabled)
Active: failed (Result: timeout) since Sun 2020-09-27 21:51:38 PDT; 12s ago
Process: 18880 ExecStart=/usr/local/bin/etserver --daemon --cfgfile=/etc/et.cfg (code=exited, status=0/SUCCESS)
Sep 27 21:50:08 sapco systemd[1]: Starting Eternal Terminal...
Sep 27 21:50:08 sapco systemd[1]: PID file /run/etserver.pid not readable (yet?) after start.
Sep 27 21:51:38 sapco systemd[1]: et.service start operation timed out. Terminating.
Sep 27 21:51:38 sapco systemd[1]: Failed to start Eternal Terminal.
Sep 27 21:51:38 sapco systemd[1]: Unit et.service entered failed state.
Sep 27 21:51:38 sapco systemd[1]: et.service failed.
I'm seeing the same error.
Running etserver leads to Aborted
Yeah we need some logs from running etserver --logtostdout -v 9 --cfgfile=/etc/et.cfg directly to get systemd out of the equation.
@Overload119 you need to remove the --daemon option from your ExecStart as that causes the etserver process to fork and systemd loses the original process. This has been updated in the documentation.