Ignition timeout when PXE booting with LACP connected to Cisco Nexus 5k.
Issue Report
Ignition fails to wait for network to become ready.
This is a problem for operators whose hosts connect to Cisco Nexus 5k ports configured for LACP, because the ports must negotiate first. Even with fast timers and port fast enabled, it takes around 70 seconds for the interfaces to receive an IP address.
The problem is that the server does not attempt to negotiate LACP until it has been configured to do so, so you must wait for LACP to time out.
Remove LACP and everything works as expected, kind of. The PXE boot guide leaves a lot to be desired, but locksmith errors aside, it works.
Bug
Ignition gives up in less than 62 seconds, which is the fastest I've seen an IP address acquired in a bonded configuration. It then reboots the system, which is a painful 5 minute wait while troubleshooting this issue, by the way.
Container Linux Version
Container Linux 1911.4.0.
$ cat /etc/os-release
NAME="Container Linux by CoreOS"
...
BUG_REPORT_URL="https://issues.coreos.com"
Environment
What hardware/cloud provider/hypervisor is being used to run Container Linux? Bare metal, Supermicro server, with Intel NICs.
Expected Behavior
If it waited the 90 seconds that is supposedly the default (as I saw in another issue that I could not find again), or waited until the host has an IP address, there would be no issue.
Actual Behavior
Complete craziness: knowing that it has no IP address and no network connectivity, it repeatedly attempts to download a file, which in fact requires a network connection, and gives up very shortly before that is possible, like 5-10 seconds or so.
Reproduction Steps
- Connect two interfaces to a Cisco 5k with channel-group mode active configured:
    int eth1/1
      lacp rate fast
      channel-group 100 mode active
    interface port-channel100
      no lacp graceful-convergence
      spanning-tree port type edge
  You could try messing with the switch config, but no combination of settings makes it work. lacp rate fast and no graceful-convergence with spanning tree set to edge buys you 10 seconds or so.
- Set up a PXE server and ensure it works.
- Attempt to boot and wait for the issue to appear.
Other Information
I also attempted to set the IP address in the kernel boot options but the system halted because "the Transport endpoint is not connected". Probably because of LACP negotiation but who knows. This may not be an issue in your "development" environment but in real world metal cases it is definitely a problem.
A possible fix would be to add a kernel option to specify interfaces that should be bonded.
Which would allow for proper LACP negotiation on startup and avoid the LACP timeout problem.
Some more details, I think that it thinks that the interface gained IPv6 connectivity when it did not.
[ 2.373696] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
It's around this time that the download first tries to take place.
I suppose a custom OEM would probably also solve this. I should have scrolled to the bottom before spending hours troubleshooting this.
Custom OEM is the way to go if you're having this issue as well. This should probably still be addressed and/or the documentation should be updated.
If it's fetching things over HTTP, you can adjust the timeouts in the ignition.timeouts section. Would that fix your problem, or is it timing out fetching the initial config?
It is timing out fetching the initial config.
As an (ugly) workaround, you could use a data URL for the initial config (assuming you're using coreos.config.url on the kernel command line), so that it is just a stub config that sets the timeouts and fetches the real config using the ignition.config.replace section.
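A minimal sketch of how such a stub could be generated, not a tested recipe. The spec version "2.2.0", the timeout values, and the URL http://example.com/real-config.ign are assumptions for illustration; the field names follow the ignition.timeouts and ignition.config.replace sections mentioned above. The stub is base64-encoded so the data URL contains no spaces or quotes on the kernel command line.

```python
import base64
import json

# Hypothetical location of the real config; replace with your own server.
REAL_CONFIG_URL = "http://example.com/real-config.ign"


def build_stub_data_url(real_url, header_timeout_s=30, total_timeout_s=120):
    """Return a data: URL holding a stub Ignition config.

    The stub only raises the HTTP timeouts and points Ignition at the
    real config. The spec version and timeout values are assumptions.
    """
    stub = {
        "ignition": {
            "version": "2.2.0",
            "timeouts": {
                "httpResponseHeaders": header_timeout_s,
                "httpTotal": total_timeout_s,
            },
            "config": {"replace": {"source": real_url}},
        }
    }
    # Compact JSON, then base64, to keep the kernel argument shell-safe.
    payload = json.dumps(stub, separators=(",", ":")).encode("utf-8")
    return "data:text/plain;base64," + base64.b64encode(payload).decode("ascii")


if __name__ == "__main__":
    # Prints the argument to append to the PXE kernel command line.
    print("coreos.config.url=" + build_stub_data_url(REAL_CONFIG_URL))
```

Because the stub travels inside the kernel command line, reading it needs no network; only the fetch of the real config does.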
Ignition on CL depends on network.target. What the network being up means is somewhat nebulous (see https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/). This is why we retry instead of trying to detect if network is up.
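The retry-instead-of-detect approach can be illustrated with a short sketch. This is not Ignition's code (Ignition is written in Go); the function name, the default of six attempts, and the 0.1 s initial backoff are assumptions chosen for illustration.

```python
import time


def fetch_with_retry(fetch, attempts=6, initial_backoff_s=0.1, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts run out.

    fetch is any zero-argument callable that raises OSError while the
    network is unavailable. No attempt is made to detect link state;
    the loop simply tries again after a doubling pause.
    """
    backoff = initial_backoff_s
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except OSError as err:
            last_error = err
            sleep(backoff)  # pause after every failed attempt
            backoff *= 2    # exponential backoff
    raise TimeoutError(f"gave up after {attempts} attempts") from last_error
```

The trade-off is that the caller must pick a total budget up front: if the network needs longer than the attempts allow, the fetch fails even though it would have succeeded shortly afterwards.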
It's hard to determine what a "reasonable" timeout is. I'd agree that ~60s (what it currently is) is perhaps a little short. Do you think increasing it to ~120s would fix your problem?
90 seconds would be sufficient, 120 would definitely fix it.
It takes 30 seconds for LACP to time out, spanning tree hello and forwarding takes around 17 seconds without port fast enabled, and then the DHCP client has to do its thing.
I'm unsure, but I believe spanning tree has to do its cycle once more after the LACP timeout, since the port falls back into an individual state. I could be wrong though; I haven't gone that far into it.
I'll time the Intel pxe boot agent to see what it uses, they probably have it figured out pretty good.
They are using ~73 seconds, probably 75.
With port-fast I get an IP at ~62 seconds and without it at ~72 seconds. The Intel agent always works though, so maybe 75 is the magic number.
it repeatedly attempts to download a file which in fact requires a network connection and gives up very shortly before it is possible, like 5-10 seconds or so.
@CyrusTheVirusG can you please attach the specific log entries showing this (and the whole boot logs if possible)? This doesn't match my expectations and I'm worried there may be something else going on.
@ajeddeloh which 60s timeout are you thinking about?
This is the journal file from before I tampered with the boot. Basically, I added nsswitch.conf/resolv.conf to resolve the name servers, because it initially failed with not finding a nameserver. I was completely unaware that the interfaces did not come up. So the "error" at this point is that it cannot find the nameserver, but the root cause is still that the interfaces are not up and the system gives up too early. Journal.txt attached.
One additional point: I also mount a CA certificate. That is the only thing I have always mounted, as I need my self-signed CA in order to make SSL work. This used to work about 8 months ago.
The exact process is: create /etc/ssl/certs/mycert.pem and put it into a "cert.cpio.gz", which gets mounted during boot. This used to work on my machine (TM) :)
Ah, nevermind, I was mixing the client HTTP total timeout (which is unlimited) with the config fetching timeout (which is one minute).
Err, it's actually not one minute, it's 10.1 + 10.2 + 10.4 + 10.8 + 11.6 + 13.2 = 66.3 seconds (although the last request would start at t=52.3s). Each HTTP attempt times out after 10 seconds, then there is a backoff between attempts that increases exponentially. This really ought to be simplified, since the exponential backoff is dwarfed by the HTTP timeout, but for now we could probably increase the max backoff to 30 sec or something similar.
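The arithmetic above can be checked with a small sketch. It assumes six attempts, a 10 s per-attempt timeout, and a backoff that starts at 0.1 s and doubles; these values are inferred from the terms listed in the comment, not read from the Ignition source. Summed as listed, the six terms come to 66.3 s.

```python
def retry_terms(attempts=6, http_timeout_s=10.0, initial_backoff_s=0.1):
    """Return the per-attempt cost: HTTP timeout plus the backoff after it.

    All defaults are assumptions inferred from the numbers in the thread.
    """
    terms = []
    backoff = initial_backoff_s
    for _ in range(attempts):
        terms.append(round(http_timeout_s + backoff, 1))
        backoff *= 2  # exponential backoff between attempts
    return terms


terms = retry_terms()
total = round(sum(terms), 1)
print(terms)  # [10.1, 10.2, 10.4, 10.8, 11.6, 13.2]
print(total)  # 66.3
```

Under these assumptions the whole budget stays below the ~75 seconds the Intel PXE agent was observed to allow, which is consistent with the fetch giving up shortly before the bonded interfaces come up.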