extra-container Add some help for startup failure

When a systemd unit fails to launch, it would be great to show the tail of its journalctl output.

Also, when everything is OK, maybe mention the command sudo journalctl -M container-name for viewing the container's internal logs. Activation warnings are visible only there.

Oct 02 '18 09:10 danbst

Great idea! Could you share a minimal config for a container that fails on startup?

Oct 08 '18 10:10 erikarvstedt

This fails activation on NixOS, and it fails in a container too:

    services.postgresql.enable = true;
    services.postgresql.extraConfig = ";";

Oct 08 '18 11:10 danbst

extra-container add -s <<'EOF'
{
  containers.faildemo = {
      config = {
        services.postgresql.enable = true;
        services.postgresql.extraConfig = ";";
      };
  };
}
EOF

With the above config the postgresql service inside the container fails, but the container service itself keeps on running. (systemctl status container@faildemo)

For the container service to fail, its nspawn process must fail. That could happen, for example, when the container init process (first the stage 2 init script, then the systemd process) fails. This is a very unlikely outcome, which is why I asked for an example.

Did you really experience an actual container startup failure? Or are you just looking for a way to get notified of service failures inside the container?
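To make the distinction concrete: a failed service inside a running container can be checked from the host through systemctl's --machine option. Below is a minimal sketch, assuming the container is registered with systemd-machined under its name (faildemo above) and that the caller has enough privileges (root may be required). The helper name container_failed_units is hypothetical and not part of extra-container.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of extra-container): list the units that
# are in the "failed" state inside a running container.
# $1 is the container name as known to systemd-machined, e.g. "faildemo".
container_failed_units() {
    # --machine targets the container's systemd instance,
    # --failed restricts the listing to failed units,
    # --no-legend and --plain keep the output easy to parse.
    systemctl --machine="$1" --failed --no-legend --plain
}
```

Calling container_failed_units faildemo with the config above would be expected to list postgresql.service, while systemctl status container@faildemo on the host still reports the container unit as running.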

Oct 08 '18 12:10 erikarvstedt

This is what I get when I activate a faulty configuration on NixOS:

activating the configuration...
setting up /etc...
reloading user units for danbst...
setting up tmpfiles
reloading the following units: dbus.service
restarting the following units: polkit.service
starting the following units: accounts-daemon.service
warning: the following units failed: postgresql.service

● postgresql.service - PostgreSQL Server
   Loaded: loaded (/nix/store/x3jaw8c7wql7hqzfzl3lzivgprfyik56-unit-postgresql.service/postgresql.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Mon 2018-10-08 14:58:47 EEST; 20ms ago
  Process: 17368 ExecStartPost=/nix/store/5hinhmldw0wvz6anppy6qddff41hmfl8-unit-script-postgresql-post-start (code=exited, status=1/FAILURE)
  Process: 17367 ExecStart=/nix/store/p456h4s30czb24aj1vp2hsn6bj6q6grh-unit-script-postgresql-start (code=exited, status=1/FAILURE)
  Process: 17356 ExecStartPre=/nix/store/x6pl6dp6wwgywm5y15qgz4nljxvfs7b2-unit-script-postgresql-pre-start (code=exited, status=0/SUCCESS)
 Main PID: 17367 (code=exited, status=1/FAILURE)

жов 08 14:58:47 station p456h4s30czb24aj1vp2hsn6bj6q6grh-unit-script-postgresql-start[17367]: LOG:  syntax error in file "/var/lib/postgresql/9.6/postgresql.conf" line 6, near token ";"
жов 08 14:58:47 station p456h4s30czb24aj1vp2hsn6bj6q6grh-unit-script-postgresql-start[17367]: FATAL:  configuration file "/var/lib/postgresql/9.6/postgresql.conf" contains errors
жов 08 14:58:47 station systemd[1]: postgresql.service: Main process exited, code=exited, status=1/FAILURE
жов 08 14:58:47 station sudo[17461]:     root : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAND=/nix/store/h1hp1rankf5py63qs453bq418mww5hpw-postgresql-9.6.10/bin/psql --port=5432 -d postgres -c
жов 08 14:58:47 station sudo[17461]: pam_unix(sudo:session): session opened for user postgres by (uid=0)
жов 08 14:58:47 station sudo[17461]: pam_unix(sudo:session): session closed for user postgres
жов 08 14:58:47 station 5hinhmldw0wvz6anppy6qddff41hmfl8-unit-script-postgresql-post-start[17368]: /nix/store/5hinhmldw0wvz6anppy6qddff41hmfl8-unit-script-postgresql-post-start: line 3: kill: (17367) - No such process
жов 08 14:58:47 station systemd[1]: postgresql.service: Control process exited, code=exited status=1
жов 08 14:58:47 station systemd[1]: postgresql.service: Failed with result 'exit-code'.
жов 08 14:58:47 station systemd[1]: Failed to start PostgreSQL Server.
warning: error(s) occurred while switching to the new configuration

So activation has actually failed, and nixos-rebuild switch detects this. I wish the container manager could detect such a situation and report it right away.

But if you are talking about container failure, then here it is:

{
   containers.db = {
     bindMounts."/db" = { hostPath = "/no-such-path"; };
     config = {};
   };
}

and the current output:

$ sudo extra-container create test-cont.nix --start
Building containers...

Installing containers:
db

Starting containers:
db

Job for container@db.service failed because the control process exited with error code.
See "systemctl  status container@db.service" and "journalctl  -xe" for details.
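The kind of help requested here could look roughly like the following sketch: start the container unit and, if that fails, print the last lines of the unit's journal instead of only the generic systemd hint. The function name start_container_verbose and the choice of 20 lines are assumptions for illustration, and this is not how extra-container is actually implemented.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (illustration only): start a container unit and,
# if the start job fails, show the tail of that unit's journal on stderr.
# $1 is the container name, e.g. "db".
start_container_verbose() {
    name=$1
    unit="container@${name}.service"
    if ! systemctl start "$unit"; then
        echo "Container '$name' failed to start. Last log lines of $unit:" >&2
        # --lines limits the output to the most recent entries,
        # --no-pager prints directly instead of opening a pager.
        journalctl --unit="$unit" --lines=20 --no-pager >&2
        return 1
    fi
}
```

With the bindMounts example above, start_container_verbose db would return a non-zero status and print whatever the journal holds for container@db.service, which should include the reason the unit could not start.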

Oct 08 '18 13:10 danbst

I also take back my suggestion to mention journalctl -M. There is no need, since all its output is available in the host machine's journal too.

Oct 08 '18 13:10 danbst
