cloud-init status complains about systemd unit failures, only init stage modules are executed - Debian Testing/Unstable/Kali

Open dominikborkowski opened this issue 1 year ago • 2 comments

Bug report

This issue first appeared with cloud-init 24.3-1 in Debian Testing and Kali. The error message changed in 24.4-1, but the other symptoms remained:

  • everything worked fine with the packaged cloud-init 22.4-1
  • after upgrading to 24.3-1, and later from 22.4-1 to 24.4-1, cloud-init status began showing 'status: error'
  • there are seemingly no messages containing the word 'error' in the cloud-init logs
  • most importantly, only modules from the 'init' stage are executed; none of the config or final ones are

With the 24.3-1 package the error was:

Unsupported configuration: boot stage called by PID [644] outside of systemd is deprecated in 24.3 and scheduled to be removed in 29.3. Triggering cloud-init boot stages outside of intial system boot is not a fully supported operation which can lead  to incomplete or incorrect configuration. "

With the 24.4-1 package the error now is:

Failed due to systemd unit failure. Ensure all cloud-init services are enabled, and check 'systemctl' or 'journalctl' for more information.

Steps to reproduce the problem

We've been working with AWS images:

  • boot "ami-061b17d332829ab1c" # kali-last-snapshot-amd64-2024.3.0-804fcc46-63fc-4eb6-85a1-50e66d6c7215 (us-east-1)
  • check that everything works as expected on the existing cloud-init 22.4
  • either perform a full system upgrade, or just apt update && apt install cloud-init to get the latest version
  • reboot, try cloud-init status --long, and check the logs
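
The steps above can be sketched as a small script. This is only an illustration: the `run` wrapper and the `DRY_RUN` variable are hypothetical helpers (not part of cloud-init), and by default the commands are printed rather than executed so that nothing is upgraded or rebooted by accident.

```shell
#!/bin/sh
# Sketch of the reproduction steps from this report.
# DRY_RUN is a hypothetical switch: 1 (default) prints commands, 0 runs them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Before the reboot: note the installed version, then upgrade only cloud-init.
upgrade_cloud_init() {
    run cloud-init --version
    run apt update
    run apt install cloud-init
    run reboot
}

# After the reboot: collect the status and unit state shown in this report.
check_after_reboot() {
    run cloud-init status --long
    run systemctl status 'cloud*'
}
```

Call `upgrade_cloud_init` on the freshly booted instance, then `check_after_reboot` once it is back up. Set `DRY_RUN=0` to actually execute the commands; apt may still stop at the cloud.cfg conffile prompt.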

As a result of this issue, none of our scripts in /var/lib/cloud/scripts/per-* execute.

Environment details

  • Cloud-init version: 24.4 (and previously 24.3)
  • Operating System Distribution: Kali Linux, based on Debian Testing
  • Cloud provider, platform or installer type: AWS

cloud-init logs

Attaching to the issue.

A couple of additional data points

Upgrading the package:

┌──(root㉿terminal.example.com)-[~]
└─# apt upgrade
The following package was automatically installed and is no longer required:
  python3-serial
Use 'apt autoremove' to remove it.

Upgrading:
  cloud-init

Summary:
  Upgrading: 1, Installing: 0, Removing: 0, Not Upgrading: 0
  Download size: 654 kB
  Space needed: 84.0 kB / 4566 MB available

Continue? [Y/n] y
Get:1 http://kali.download/kali kali-rolling/main amd64 cloud-init all 24.4-1 [654 kB]
Fetched 654 kB in 0s (7206 kB/s)   
(Reading database ... 117950 files and directories currently installed.)
Preparing to unpack .../cloud-init_24.4-1_all.deb ...
Unpacking cloud-init (24.4-1) over (24.2-1) ...
dpkg: warning: unable to delete old directory '/etc/systemd/system/[email protected]': Directory not empty
Setting up cloud-init (24.4-1) ...

Configuration file '/etc/cloud/cloud.cfg'
 ==> Modified (by you or by a script) since installation.
 ==> Package distributor has shipped an updated version.
   What would you like to do about it ?  Your options are:
    Y or I  : install the package maintainer's version
    N or O  : keep your currently-installed version
      D     : show the differences between the versions
      Z     : start a shell to examine the situation
 The default action is to keep your current version.
*** cloud.cfg (Y/I/N/O/D/Z) [default=N] ? Y
Installing new version of config file /etc/cloud/cloud.cfg ...
update-rc.d: We have no instructions for the cloud-init-main init script.
update-rc.d: It looks like a non-network service, we enable it.
┌──(root㉿terminal.example.com)-[/boot/.cr]
└─# cloud-init status --long
status: error
extended_status: error - done
boot_status_code: enabled-by-generator
last_update: Thu, 01 Jan 1970 00:00:09 +0000
detail: Failed due to systemd unit failure
errors:
- Failed due to systemd unit failure. Ensure all cloud-init services are enabled, and check 'systemctl' or 'journalctl' for more information.
recoverable_errors: {}

┌──(root㉿terminal.example.com)-[/boot/.cr]
└─# systemctl status cloud*
● cloud-final.service - Cloud-init: Final Stage
     Loaded: loaded (/usr/lib/systemd/system/cloud-final.service; enabled; preset: enabled)
     Active: active (exited) since Fri 2024-12-13 15:28:59 UTC; 30min ago
 Invocation: a0e5d1ab9e3948fca0b06eec175cc206
    Process: 796 ExecStart=sh -c echo "start" | netcat -Uu -W1 /run/cloud-init/share/final.sock -s /run/cloud-init/share/final-return.sock | sh (code=exited, status=0/SUCCESS)
   Main PID: 796 (code=exited, status=0/SUCCESS)
   Mem peak: 1.5M
        CPU: 19ms

Dec 13 15:28:59 terminal.example.com systemd[1]: Starting cloud-final.service - Cloud-init: Final Stage...
Dec 13 15:28:59 terminal.example.com sh[798]: netcat: /run/cloud-init/share/final-return.sock: No such file or directory
Dec 13 15:28:59 terminal.example.com systemd[1]: Finished cloud-final.service - Cloud-init: Final Stage.

● cloud-init.target - Cloud-init target
     Loaded: loaded (/usr/lib/systemd/system/cloud-init.target; enabled-runtime; preset: enabled)
     Active: active since Fri 2024-12-13 15:28:59 UTC; 30min ago
 Invocation: ebf2072954da46cf868218a3cadc8cb2

Dec 13 15:28:59 terminal.example.com systemd[1]: Reached target cloud-init.target - Cloud-init target.

● cloud-config.target - Cloud-config availability
     Loaded: loaded (/usr/lib/systemd/system/cloud-config.target; static)
     Active: active since Fri 2024-12-13 15:28:57 UTC; 30min ago
 Invocation: 9c3716ac3f6442eab138e5c87f366b61

Dec 13 15:28:57 terminal.example.com systemd[1]: Reached target cloud-config.target - Cloud-config availability.

● cloud-config.service - Cloud-init: Config Stage
     Loaded: loaded (/usr/lib/systemd/system/cloud-config.service; enabled; preset: enabled)
     Active: active (exited) since Fri 2024-12-13 15:28:57 UTC; 30min ago
 Invocation: 0a286c4a48814de491a3362491287ca9
    Process: 634 ExecStart=sh -c echo "start" | netcat -Uu -W1 /run/cloud-init/share/config.sock -s /run/cloud-init/share/config-return.sock | sh (code=exited, status=0/SUCCESS)
   Main PID: 634 (code=exited, status=0/SUCCESS)
   Mem peak: 1.6M
        CPU: 19ms

Dec 13 15:28:57 terminal.example.com systemd[1]: Starting cloud-config.service - Cloud-init: Config Stage...
Dec 13 15:28:57 terminal.example.com sh[640]: netcat: /run/cloud-init/share/config-return.sock: No such file or directory
Dec 13 15:28:57 terminal.example.com systemd[1]: Finished cloud-config.service - Cloud-init: Config Stage.

● cloud-init.service - LSB: Cloud init
     Loaded: loaded (/etc/init.d/cloud-init; generated)
     Active: active (running) since Fri 2024-12-13 15:28:58 UTC; 30min ago
 Invocation: 39dcf47fa54a44bd80c77637be8f9703
       Docs: man:systemd-sysv-generator(8)
    Process: 635 ExecStart=/etc/init.d/cloud-init start (code=exited, status=0/SUCCESS)
      Tasks: 1 (limit: 2296)
     Memory: 21.4M (peak: 53.4M)
        CPU: 976ms
     CGroup: /system.slice/cloud-init.service
             └─724 dhclient

Dec 13 15:28:58 terminal.example.com dhclient[724]: DHCPACK of 10.1.143.188 from 10.1.128.1
Dec 13 15:28:58 terminal.example.com cloud-init[669]: CR-Route: Received BOUND signal for interface eth0, with IP 10.1.143.188
Dec 13 15:28:58 terminal.example.com cloud-init[669]: CR-Route: Using router from eth0 as a default gateway:
Dec 13 15:28:58 terminal.example.com cloud-init[669]: CR-Route:  ip route replace default via 10.1.128.1
Dec 13 15:28:58 terminal.example.com dhclient[724]: bound to 10.1.143.188 -- renewal in 1779 seconds.
Dec 13 15:28:58 terminal.example.com cloud-init[635]: .
Dec 13 15:28:58 terminal.example.com systemd[1]: Started cloud-init.service - LSB: Cloud init.
Dec 13 15:58:37 terminal.example.com dhclient[724]: DHCPREQUEST for 10.1.143.188 on eth0 to 10.1.128.1 port 67
Dec 13 15:58:37 terminal.example.com dhclient[724]: DHCPACK of 10.1.143.188 from 10.1.128.1
Dec 13 15:58:37 terminal.example.com dhclient[724]: bound to 10.1.143.188 -- renewal in 1552 seconds.

● cloud-init-network.service - Cloud-init: Network Stage
     Loaded: loaded (/usr/lib/systemd/system/cloud-init-network.service; disabled; preset: disabled)
     Active: active (exited) since Fri 2024-12-13 15:28:57 UTC; 30min ago
 Invocation: 47915b0cfbce44d7ae70bbdeb816f7cb
    Process: 629 ExecStart=sh -c echo "start" | netcat -Uu -W1 /run/cloud-init/share/network.sock -s /run/cloud-init/share/network-return.sock | sh (code=exited, status=0/SUCCESS)

cloud-init.tar.gz
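
For anyone checking a fleet for the same symptom, the `cloud-init status --long` output shown above can be tested in a script. The following is a minimal sketch that assumes the output format matches the paste in this report; the function names `status_field` and `has_unit_failure` are hypothetical.

```shell
#!/bin/sh
# Sketch: inspect `cloud-init status --long` output supplied on stdin.

# Print the value of the top-level "status:" field (e.g. "error").
status_field() {
    awk -F': ' '$1 == "status" { print $2; exit }'
}

# Succeed if the output contains the systemd unit failure message from 24.4-1.
has_unit_failure() {
    grep -q 'Failed due to systemd unit failure'
}
```

Usage, on an affected machine: `cloud-init status --long | status_field` prints the status value, and `cloud-init status --long | has_unit_failure && echo affected` flags the 24.4-1 message.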

dominikborkowski, Dec 13 '24 16:12

Late follow-up, with debugging assistance from @TheRealFalcon.

The issue appears to be in the package upgrade process: the new and required cloud-init-main.service systemd service is not enabled. I filed a bug with Kali (https://bugs.kali.org/view.php?id=9035), and I'll see if I can further replicate it using Debian Testing.
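
To check for this condition, the enablement state of the units can be queried with `systemctl is-enabled`. The sketch below is an assumption-laden illustration, not something taken from the cloud-init packaging: the function name `report_unit_states` and the `SYSTEMCTL` override are hypothetical, and the default unit list contains only the units named in this thread.

```shell
#!/bin/sh
# Sketch: report whether each cloud-init unit is enabled.
# SYSTEMCTL is a hypothetical override so the check can be tried without systemd.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

report_unit_states() {
    # Default to the units mentioned in this issue when no arguments are given.
    [ "$#" -gt 0 ] || set -- cloud-init-main.service cloud-init-network.service \
        cloud-config.service cloud-final.service
    rc=0
    for unit in "$@"; do
        # `is-enabled` prints the state and exits non-zero if the unit is not enabled.
        state=$("$SYSTEMCTL" is-enabled "$unit" 2>/dev/null) || rc=1
        printf '%s: %s\n' "$unit" "${state:-unknown}"
    done
    return "$rc"
}
```

Running `report_unit_states` prints one line per unit and returns non-zero if any of them is not enabled. If cloud-init-main.service is reported as disabled, `systemctl enable cloud-init-main.service` followed by a reboot would be the obvious thing to try, though this thread does not confirm it as a fix.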

dominikborkowski, Dec 17 '24 15:12

@dominikborkowski I think I see the issue in Debian. I just signed up for an account so that I can raise an issue on their GitLab.

holmanb, Dec 17 '24 19:12