
installation (ignition) has a race with systemd-networkd, starts too early

Open DirkTheDaring opened this issue 5 years ago • 3 comments

Issue Report

Ignition cannot load its configuration from a matchbox service because it races with the setup of the network devices. You can see the race in the log dump below. This setup uses some slower, older machines running inside VMware.

Bug

Dec 09 13:52:54 localhost kernel: audit: type=1130 audit(1544363573.877:7): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-networkd comm="systemd" exe="/usr/lib64/systemd/systemd" hostname=? addr=? terminal=? res=success'
Dec 09 13:52:53 localhost audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-networkd comm="systemd" exe="/usr/lib64/systemd/systemd" hostname=? addr=? terminal=? res=success'
Dec 09 13:52:53 localhost systemd-networkd[230]: Enumeration completed
Dec 09 13:52:54 localhost ignition[266]: GET error: Get https://matchbox.fritz.box/ignition?uuid=efbb4d56-4198-4819-ba75-8e059437150f&mac=00-0c-29-37-15-0f&spin=worker1&channel=stable&arch=amd64-usr&flavor=kubespray&domain=fritz.box&stage=boot: dial tcp: lookup matchbox.fritz.box on 192.168.178.250:53: dial udp 192.168.178.250:53: connect: network is unreachable
Dec 09 13:52:54 localhost ignition[266]: GET error: Get https://matchbox.fritz.box/ignition?uuid=efbb4d56-4198-4819-ba75-8e059437150f&mac=00-0c-29-37-15-0f&spin=worker1&channel=stable&arch=amd64-usr&flavor=kubespray&domain=fritz.box&stage=boot: dial tcp: lookup matchbox.fritz.box on 192.168.178.250:53: dial udp 192.168.178.250:53: connect: network is unreachable
Dec 09 13:52:54 localhost ignition[266]: GET error: Get https://matchbox.fritz.box/ignition?uuid=efbb4d56-4198-4819-ba75-8e059437150f&mac=00-0c-29-37-15-0f&spin=worker1&channel=stable&arch=amd64-usr&flavor=kubespray&domain=fritz.box&stage=boot: dial tcp: lookup matchbox.fritz.box on 192.168.178.250:53: dial udp 192.168.178.250:53: connect: network is unreachable
Dec 09 13:52:54 localhost systemd-networkd[230]: lo: Configured
Dec 09 13:52:55 localhost systemd-networkd[230]: eth0: IPv6 successfully enabled
Dec 09 13:52:55 localhost ignition[266]: GET error: Get https://matchbox.fritz.box/ignition?uuid=efbb4d56-4198-4819-ba75-8e059437150f&mac=00-0c-29-37-15-0f&spin=worker1&channel=stable&arch=amd64-usr&flavor=kubespray&domain=fritz.box&stage=boot: dial tcp: lookup matchbox.fritz.box on 192.168.178.250:53: dial udp 192.168.178.250:53: connect: network is unreachable
Dec 09 13:52:57 localhost ignition[266]: GET error: Get https://matchbox.fritz.box/ignition?uuid=efbb4d56-4198-4819-ba75-8e059437150f&mac=00-0c-29-37-15-0f&spin=worker1&channel=stable&arch=amd64-usr&flavor=kubespray&domain=fritz.box&stage=boot: dial tcp: lookup matchbox.fritz.box on 192.168.178.250:53: dial udp 192.168.178.250:53: connect: network is unreachable
Dec 09 13:52:57 localhost systemd-networkd[230]: eth0: Gained carrier
Dec 09 13:52:57 localhost systemd-networkd[230]: eth0: DHCPv4 address 192.168.178.49/24 via 192.168.178.1
Dec 09 13:52:58 localhost systemd-networkd[230]: eth0: Gained IPv6LL
Dec 09 13:53:10 localhost systemd-networkd[230]: eth0: Configured
Dec 09 13:53:54 localhost systemd-networkd[230]: lo: Lost carrier
Dec 09 13:53:54 localhost systemd-networkd[230]: eth0: Lost carrier
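The timing in the log above can be summarized with a short script. This is only a sketch for reading such a dump: the function names (`parse_line`, `summarize`) are made up for illustration, the year is passed in because journal short timestamps omit it, and the matching relies on the `GET error:` and `DHCPv4 address` strings exactly as they appear in this log.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Dec 09 13:52:57 localhost systemd-networkd[230]: eth0: Gained carrier
LINE_RE = re.compile(
    r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) \S+ ([\w.-]+)(?:\[\d+\])?: (.*)$"
)


def parse_line(line, year=2018):
    """Return (timestamp, unit, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # The year is an assumption supplied by the caller; the log does not carry it.
    ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
    return ts, m.group(2), m.group(3)


def summarize(lines, year=2018):
    """Collect when Ignition fetches failed and when DHCPv4 handed out an address."""
    result = {
        "fetch_errors": 0,
        "first_fetch_error": None,
        "last_fetch_error": None,
        "address_acquired": None,
    }
    for line in lines:
        parsed = parse_line(line, year)
        if parsed is None:
            continue
        ts, unit, msg = parsed
        if unit == "ignition" and msg.startswith("GET error:"):
            result["fetch_errors"] += 1
            if result["first_fetch_error"] is None:
                result["first_fetch_error"] = ts
            result["last_fetch_error"] = ts
        elif unit == "systemd-networkd" and "DHCPv4 address" in msg:
            if result["address_acquired"] is None:
                result["address_acquired"] = ts
    return result
```

Feeding it the dump above (for example via `summarize(open("boot.log"))`, where `boot.log` is a hypothetical file holding these lines) makes it easy to compare the time of the first failed fetch with the time eth0 obtained its DHCPv4 address.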

Container Linux Version

VERSION=1911.4.0

Environment

What hardware/cloud provider/hypervisor is being used to run Container Linux? VMware, 2 GB, IPv4 network

Expected Behavior

Ignition loads the config from coreos.config.url as before and the machine gets installed.

Actual Behavior

The install fails: the network devices come up a few seconds late, so Ignition's attempts to load the config from the URL fail.

Reproduction Steps

  1. Get a 2 GB VMware machine and try to PXE boot CoreOS with a coreos.config.url pointing to a service
  2. ...

Other Information

DirkTheDaring • Dec 09 '18 14:12

There is a suggested workaround in #2527 for config timeouts.

dm0- • Dec 09 '18 15:12

This "workaround" left me more confused. Is there an example or a step-by-step description of how to apply it?

DirkTheDaring • Dec 09 '18 15:12

@DirkTheDaring can you please attach the subsequent log entries (i.e. for the whole boot)? In the default configuration, Ignition will keep retrying until it is able to fetch the userdata. Racing with network stabilization is expected, but a boot failure is likely due to some other problem.

lucab • Dec 09 '18 21:12
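The retry behaviour described in the last comment (keep fetching until the network is up) can be illustrated with a small sketch. This is not Ignition's actual code: `fetch_with_retry`, `http_get`, the delay values, and the doubling backoff are assumptions chosen only to show why early "network is unreachable" errors need not be fatal on their own.

```python
import time
import urllib.request
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[], bytes],
    initial_delay: float = 0.1,
    max_delay: float = 5.0,
    max_attempts: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch() until it succeeds, waiting longer after each failure.

    With max_attempts=None the loop never gives up, which mirrors the
    "keep retrying" behaviour described in the thread.
    """
    delay = initial_delay
    attempt = 0
    while True:
        attempt += 1
        try:
            return fetch()
        except OSError as err:  # urllib.error.URLError is a subclass of OSError
            if max_attempts is not None and attempt >= max_attempts:
                raise
            print(f"GET error (attempt {attempt}): {err}; retrying in {delay:.1f}s")
            sleep(delay)
            # Double the wait, but never exceed max_delay.
            delay = min(delay * 2, max_delay)


def http_get(url: str, timeout: float = 10.0) -> bytes:
    """Plain HTTP(S) GET that returns the response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()
```

A call such as `fetch_with_retry(lambda: http_get(config_url))`, with `config_url` standing in for the value of coreos.config.url, would log failures while the interface is still down and return the config once DHCP has completed. If the boot still fails under this kind of loop, the cause is more likely a timeout or a later error, which is why the full boot log is needed.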