extra-container
Add some help for startup failure
When a systemd unit fails to launch, it would be great to show the journalctl tail.
Also, when everything is OK, maybe mention that the command sudo journalctl -M container-name shows the container's internal logs; activation warnings are visible only there.
Great idea! Could you share a minimal config for a container that fails on startup?
This fails activation on NixOS, and it fails in a container too:
services.postgresql.enable = true;
services.postgresql.extraConfig = ";";
extra-container add -s <<'EOF'
{
containers.faildemo = {
config = {
services.postgresql.enable = true;
services.postgresql.extraConfig = ";";
};
};
}
EOF
With the above config, the postgresql service inside the container fails, but the container service itself keeps running (see systemctl status container@faildemo).
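To make the distinction concrete, here is a minimal sketch that checks both sides: the host-side container unit and the units inside the container. The helper name show_container_failures is hypothetical and not part of extra-container; it assumes systemd's --machine (-M) option can reach the container and that it is run as root.

```shell
# Hypothetical helper, not part of extra-container.
show_container_failures() {
  name="$1"
  # Host side: state of the systemd-nspawn wrapper unit.
  systemctl status "container@${name}.service" --no-pager
  # Container side: list units that are in the failed state.
  systemctl -M "$name" --failed --no-pager
}
```

For the example above, show_container_failures faildemo would be expected to report the host unit as active while listing postgresql.service as failed inside the container.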
For the container service to fail, its nspawn process must fail. That could happen, for example, when the container init process (first the stage 2 init script, then the systemd process) fails. This is a very unlikely outcome, which is why I asked for an example.
Did you really experience an actual container startup failure? Or are you just looking for a way to get notified of service failures inside the container?
This is what I get when I activate a faulty configuration on NixOS:
activating the configuration...
setting up /etc...
reloading user units for danbst...
setting up tmpfiles
reloading the following units: dbus.service
restarting the following units: polkit.service
starting the following units: accounts-daemon.service
warning: the following units failed: postgresql.service
● postgresql.service - PostgreSQL Server
Loaded: loaded (/nix/store/x3jaw8c7wql7hqzfzl3lzivgprfyik56-unit-postgresql.service/postgresql.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2018-10-08 14:58:47 EEST; 20ms ago
Process: 17368 ExecStartPost=/nix/store/5hinhmldw0wvz6anppy6qddff41hmfl8-unit-script-postgresql-post-start (code=exited, status=1/FAILURE)
Process: 17367 ExecStart=/nix/store/p456h4s30czb24aj1vp2hsn6bj6q6grh-unit-script-postgresql-start (code=exited, status=1/FAILURE)
Process: 17356 ExecStartPre=/nix/store/x6pl6dp6wwgywm5y15qgz4nljxvfs7b2-unit-script-postgresql-pre-start (code=exited, status=0/SUCCESS)
Main PID: 17367 (code=exited, status=1/FAILURE)
жов 08 14:58:47 station p456h4s30czb24aj1vp2hsn6bj6q6grh-unit-script-postgresql-start[17367]: LOG: syntax error in file "/var/lib/postgresql/9.6/postgresql.conf" line 6, near token ";"
жов 08 14:58:47 station p456h4s30czb24aj1vp2hsn6bj6q6grh-unit-script-postgresql-start[17367]: FATAL: configuration file "/var/lib/postgresql/9.6/postgresql.conf" contains errors
жов 08 14:58:47 station systemd[1]: postgresql.service: Main process exited, code=exited, status=1/FAILURE
жов 08 14:58:47 station sudo[17461]: root : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAND=/nix/store/h1hp1rankf5py63qs453bq418mww5hpw-postgresql-9.6.10/bin/psql --port=5432 -d postgres -c
жов 08 14:58:47 station sudo[17461]: pam_unix(sudo:session): session opened for user postgres by (uid=0)
жов 08 14:58:47 station sudo[17461]: pam_unix(sudo:session): session closed for user postgres
жов 08 14:58:47 station 5hinhmldw0wvz6anppy6qddff41hmfl8-unit-script-postgresql-post-start[17368]: /nix/store/5hinhmldw0wvz6anppy6qddff41hmfl8-unit-script-postgresql-post-start: line 3: kill: (17367) - No such process
жов 08 14:58:47 station systemd[1]: postgresql.service: Control process exited, code=exited status=1
жов 08 14:58:47 station systemd[1]: postgresql.service: Failed with result 'exit-code'.
жов 08 14:58:47 station systemd[1]: Failed to start PostgreSQL Server.
warning: error(s) occurred while switching to the new configuration
So, activation has actually failed and nixos-rebuild switch detects this. I wish the container manager could detect such a situation and report it right away.
But if you are talking about container failure, then here it is:
{
containers.db = {
bindMounts."/db" = { hostPath = "/no-such-path"; };
config = {};
};
}
and current output:
$ sudo-extra-container create test-cont.nix --start
Building containers...
Installing containers:
db
Starting containers:
db
Job for container@db.service failed because the control process exited with error code.
See "systemctl status container@db.service" and "journalctl -xe" for details.
I also take back my suggestion to mention journalctl -M: there is no need, since all its output is available in the host machine's journal too.
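The behavior requested at the top of this issue, showing the journal tail when the container unit fails to start, could be sketched roughly as follows. This is only an illustration and not how extra-container is implemented; the function name start_container_or_show_log and the choice of 20 lines are assumptions.

```shell
# Hypothetical wrapper, not extra-container's actual behavior.
start_container_or_show_log() {
  unit="container@$1.service"
  if ! systemctl start "$unit"; then
    # On failure, print the most recent host journal entries for the unit.
    echo "Failed to start $unit; last journal entries:" >&2
    journalctl -u "$unit" -n 20 --no-pager >&2
    return 1
  fi
}
```

With the bindMounts example above, start_container_or_show_log db would print the tail of the host journal for container@db.service right after the failed start job.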