nm-run.sh should check interfaces repeatedly until they are up
Currently, `nm-wait-online-initrd.service` only waits for NetworkManager to activate all devices; it doesn't wait for NetworkManager to reach a connected state. Because the settled hook `nm-run.sh` only provides one opportunity for the connection to be up, when DHCP loses the race to `nm-run.sh`, the online hooks are not run and the boot fails.
Here's an example of the journal illustrating the issue. I've patched `warn "$(nmcli)"; warn "$(cat /run/NetworkManager/system-connections/default_connection.nmconnection)"` into `nm-run.sh` to show that DHCP loses out to `nm-run.sh`:
- `nm-wait-online-initrd.service` executes `nm-online -s` (line 5967).
- `nm-online -s` waits for NetworkManager client startup, skipping the `CONNECTED_*` state check.
- The device activates, so NetworkManager reports "startup complete" (line 15681) and `nm-online -s` returns; but DHCP isn't done yet (in fact it hasn't actually started).
- `dracut-initqueue` begins and runs the `nm-run.sh` settled hook (line 17173).
- `nm-run.sh` sees that the device has no UUID yet, so skips executing the online hooks for the device (line 17278). (NB: setting `may-fail=false`, line 17219, doesn't affect `-s`.)
- `nm-run.sh` writes `/tmp/nm.done` (line 17287).
- DHCP finally finishes (line 20510, interspersed with NetworkManager logs).
- On every subsequent run, `nm-run.sh` sees `/tmp/nm.done` and exits without running the online hooks for the now-online device (line 20628).
Changes
There are several possible solutions:
- (This PR) Remove the `-s` option from the `nm-online` invocation in `nm-wait-online-initrd.service`, so that `nm-online` waits for `CONNECTED_*`, including DHCP being done. Here's an example of the journal with this PR in place, showing `nm-run.sh` waiting for DHCP and successfully running the online hooks.
- Make `nm-run.sh` not write `/tmp/nm.done` unless it has confirmed NetworkManager's state is `CONNECTED_*`.
- Make `nm-run.sh` keep track of which devices it hasn't yet run the online hooks for, and retry them on the next run.
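The third option could look roughly like this (a minimal sketch; `mark_iface_done`, `all_ifaces_done`, and the `/tmp/nm.done.d` directory are hypothetical names, not current dracut code): keep a per-interface marker, and only write the global `/tmp/nm.done` flag once every expected interface is accounted for.

```shell
#!/bin/sh
# Hypothetical sketch: track per-interface completion instead of a
# single one-shot flag. Paths are overridable so the sketch is testable.
NM_DONE_DIR="${NM_DONE_DIR:-/tmp/nm.done.d}"
NM_DONE="${NM_DONE:-/tmp/nm.done}"

mark_iface_done() {
    # Record that the online hooks have run for interface $1
    mkdir -p "$NM_DONE_DIR"
    : > "$NM_DONE_DIR/$1"
}

all_ifaces_done() {
    # $@ = interfaces we expect to come online; write the global flag
    # only when every one of them has a per-interface marker
    for iface in "$@"; do
        [ -e "$NM_DONE_DIR/$iface" ] || return 1
    done
    : > "$NM_DONE"
}
```

With this shape, a repeated run of `nm-run.sh` only retries interfaces that have no marker yet, so each device's hooks still run once.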
Checklist
- [x] I have tested it locally
- [ ] I have reviewed and updated any documentation if relevant
- [ ] I am providing new code and test(s) for it
Fixes #
JFYI, previous discussion here: #1682
Thanks, @aafeijoo-suse -- I hadn't noticed the duplicate PR. My apologies.
My experience bears out the first sentence, but not the second, of what @bengal has written:
That's not correct. -s waits that NM has attempted to activate every connection. A connection attempt ends either when the connection is successful (i.e. it gets an address if it needs to) or when it fails (for DHCP, after the timeout specified by rd.net.timeout.dhcp=).
The debug logs in the gists I linked above show that NM "activates" the eno1 connection once its device is active, so reports "startup complete", whereupon `nm-online -s` exits. However, the connection attempt clearly hasn't started DHCP. It is the absence of `-s` that makes `nm-online` wait for the `CONNECTED_*` state, namely for DHCP to have finished, before exiting.
I don't have multiple connections to play with, but my quick read of `nm-online`'s source code seems to suggest that `-s`, viz. "startup complete", tests whether every autoconnect connection has been activated (or at least attempted); and the absence of `-s` tests whether any connection has become connected, putting NM into a `CONNECTED_*` state. I don't know how waiting for the `CONNECTED_*` state impinges on other dracut modules, but as far as I can tell, `nm-run.sh` definitely needs it; otherwise it will skip all the online hooks and the boot will fail.
The problem in the log above is that eno1 doesn't have carrier. NM has a built-in carrier-timeout of 6 seconds, after which it gives up waiting for the device. Since the default value is probably too short (the network-legacy module had 10 seconds timeout), I have a patch in the works to increase the timeout to 10 or 15 seconds: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1239
In the log I see:
Jul 01 16:17:53 fedora NetworkManager[904]: <debug> [1656692273.7863] device[479c44c90ebf5ecc] (eno1): add_pending_action (3): 'carrier-wait'
...
Jul 01 16:17:59 fedora NetworkManager[904]: <debug> [1656692279.7927] device[479c44c90ebf5ecc] (eno1): remove_pending_action (0): 'carrier-wait'
Jul 01 16:17:59 fedora NetworkManager[904]: <trace> [1656692279.7929] settings: startup-complete: ready, no more profiles to wait for
...
Jul 01 16:18:00 fedora kernel: e1000e 0000:00:1f.6 eno1: NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx
So, eno1 gets carrier after 7 seconds.
Until the value is bumped in NM, you can supply the kernel argument `rd.net.timeout.carrier=<seconds>` to increase it to an arbitrary value.
I don't have multiple connections to play with, but my quick read of `nm-online`'s source code seems to suggest that `-s`, viz. "startup complete", tests whether every autoconnect connection has been activated (or at least attempted); and the absence of `-s` tests whether any connection has become connected
Correct. In the initrd case we want to wait for every generated connection to finish, hence the `-s`. The difference is that the startup-complete event used by `-s` has a timeout on carrier, while waiting for the connected state doesn't.
Awesome, `rd.net.timeout.carrier` solves my problem just as well. Thank you @bengal.
However, from reading the `dracut.modules` man page's description of initqueue/online, I'm still confused why, without `rd.net.timeout.carrier`, `nm-run.sh` didn't eventually execute the online hooks for eno1:
This hook (initqueue/online) gets executed whenever a network interface comes online (that is, once it is up and configured by the configured network module).
Is that supposed to happen with NM? The logs I shared above show that `nm-run.sh` only checks each connection once for triggering the online hooks, then never checks again even when eno1 did come online; so the online hooks appear not to obey the semantics described in the man page. Is `nm-run.sh` intended to behave this way?
(This discrepancy in the behaviour of `nm-run.sh` is obscured when `nm-online -s` does obtain all the connections as you described; but if the carrier-wait elapses, the behaviour surfaces, as seen in the debug logs.)
`nm-run.sh` is executed when NM reaches the startup-complete phase, and is executed only once. It checks which interfaces have an active connection (by looking at the `connection-uuid` in `/run/NetworkManager/devices/$ifindex`), and invokes the online and netroot hooks on them.
If the interface doesn't have carrier before the timeout, startup-complete is reached early, while no connection is active on the interface. Therefore, the hooks are not called.
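For illustration, the check described above could be sketched like this (the helper name is hypothetical, and `NM_DEVICES_DIR` is parameterised only so the sketch can run against test data; the real path is `/run/NetworkManager/devices`, and the exact layout of NM's per-device state file is an assumption here):

```shell
#!/bin/sh
# Sketch: read the connection-uuid key from NM's per-device state file.
NM_DEVICES_DIR="${NM_DEVICES_DIR:-/run/NetworkManager/devices}"

device_connection_uuid() {
    # $1 = ifindex of the device; print the UUID of the active connection,
    # or nothing if no connection is active (e.g. carrier never came up)
    sed -n 's/^connection-uuid=//p' "$NM_DEVICES_DIR/$1" 2>/dev/null
}
```

An empty result from `device_connection_uuid` is exactly the "device has no UUID yet" case that makes `nm-run.sh` skip the online hooks.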
Is nm-run.sh intended to behave this way?
Ideally, the carrier timeout should be long enough, and https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1239 will help with that.
If it's not long enough, then `network-online` is reached too early and potentially the hooks won't be called. Probably the implementation of the module could be improved to avoid this, but I think the best solution is to have a timeout that is long enough.
Yep, I follow your reasoning. And your PR for the lengthened timeout in NM would work for me. But since you've said:
If it's not long enough, then network-online is reached too early and potentially the hooks won't be called. Probably the implementation of the module could be improved to avoid this,
My later suggestion is exactly the improvement you've described. If you're willing to allow `nm-run.sh` to run through multiple times, checking for newly active connections each time, then the hooks will always eventually be called regardless of the NM timeout, which (1) completely eliminates the race condition in `nm-online -s` and obviates the question of how long the timeout should be, and (2) complies with the description in the initqueue/online docs.
Do you know of a reason this suggestion would be a bad idea? If not, I can fiddle around with it and come up with a POC.
Do you know of a reason this suggestion would be a bad idea? If not, I can fiddle around with it and come up with a POC.
I think it's not actually a bad idea; I'm mostly concerned about the complexity that it will introduce, but maybe it can be done easily.
Currently `nm-run.sh` is executed in the initqueue, and `nm-wait-online-initrd.service` is ordered before `dracut-initqueue.service` so that the initqueue runs after NetworkManager reaches startup-complete (i.e. when `nm-online -s` also returns).
`nm-run.sh` writes `/tmp/nm.done` to ensure it's run only once. If you remove the check, then the script will potentially run multiple times, until it's removed from the initqueue and as long as there are other hooks in the `hooks/finished` directory keeping the initqueue looping. I wonder whether we need to ensure that the `/sbin/netroot` and `initqueue/online` hooks are run only once for each interface, or whether it's okay to run them multiple times. In the latter case, it would be simpler.
Hi @bengal, I've taken a shot at updating `nm-run.sh` so that it keeps track of which NM-managed interfaces it hasn't yet run the online hooks for. This way, each interface has the online hook run only once. Only once the online hooks have run for all the interfaces does `/tmp/nm.done` get written, so there's no dependence on interfaces coming up before the carrier timeout elapses.
Here's a log showing how it works on my machine.
Let me know of any corner cases I haven't accounted for. I'm not sure if there's a better way than `nmcli` to figure out whether an interface is not NM-managed.
Hi, thanks for the patch. The problem I see is the following: when there is an interface that doesn't have any matching connection profile generated (but the interface is still managed), `ct` will be set to 1, preventing the write of `/tmp/nm.done`. This will also prevent the initqueue from finishing, due to this line in `nm_generate_connections()`:
`echo '[ -f /tmp/nm.done ]' > "$hookdir"/initqueue/finished/nm.sh`
For example, suppose you have `ip=enp1s0:dhcp rd.neednet=1` on the command line, and there are `enp1s0` and `enp2s0` interfaces; will this work? Or will the boot be stuck forever?
Hmm, good point.
Would `enp2s0`'s connection be inhibited by `initrd-no-auto-default.conf`?
If not, we'll have to add some marker to distinguish those connection keyfiles in `/run/NetworkManager/system-connections` that were generated by `nm-initrd-generator`. Or maybe, now that this PR is going to create a `/tmp/nm.done.d` directory to keep track of the done interfaces, we can make `initqueue/finished/nm.sh` check each individual interface's file in `/tmp/nm.done.d`, on top of `/tmp/nm.done`. Which would you prefer?
Would enp2s0's connection be inhibited by initrd-no-auto-default.conf?
`initrd-no-auto-default.conf` prevents the automatic creation of a "Wired Connection 1" profile when no other matching profile exists for the interface. It doesn't help in this scenario.
If not, we'll have to add some marker to distinguish those connection keyfiles in /run/NetworkManager/system-connections that were generated by nm-initrd-generator. Or maybe, now that this PR is going to create a /tmp/nm.done.d to keep track of the done interfaces, we can make initqueue/finished/nm.sh check each individual interface's file in /tmp/nm.done.d, on top of /tmp/nm.done. Which would you prefer?
The problem is this: we need to check which devices are still waiting for a potential activation; or, in other words, which devices have a matching connection profile but are not yet activated (for example because carrier is missing).
At the same time, we must not consider devices without a matching profile (as enp2s0 in my example above). For those, it would be wrong to wait.
But deciding if the device is in the first group or the second is not trivial. The generator could create a file containing a list of devices for which it generated a profile. The file would contain:
- the interface name, for arguments like `ip=enp1s0:dhcp`
- the MAC address when the interface is specified via a MAC: `ip=00-11-22-33-44-55:dhcp`. The generator can't look up the matching interface name because at the time it runs, interfaces are not necessarily discovered yet.
- in the case of wildcard profiles like `ip=dhcp`, it would store something like `*`.
Then, nm-run.sh would need some more logic to parse the file and decide if all devices specified there are activated. This seems quite fragile and requires an additional layer of code, so I would avoid this.
A possible alternative would be that instead of calling hooks from nm-run.sh (only once, when NM signals network-online), hooks are called by NM via dispatcher scripts on the UP event of interfaces. But I'm not sure if that would work, because it means that the hooks are called from a different context, and maybe environment variables are missing.
The problem is this: we need to check which devices are still waiting for a potential activation; or, in other words, which devices have a matching connection profile but are not yet activated (for example because carrier is missing). At the same time, we must not consider devices without a matching profile (as enp2s0 in my example above). For those, it would be wrong to wait.
I notice that `nm_generate_connections`, called by the `nm-config.sh` cmdline hook, does `rm -f /run/NetworkManager/system-connections/*`, then runs `nm-initrd-generator` to create the connection keyfiles (which contain the UUID of the connection). Since `nm-initrd.service` is `After=dracut-cmdline.service`, would it be correct to say that at the point when `nm-initrd-generator` completes, `/run/NetworkManager/system-connections` should for sure be a clean set of the devices in the first group you mentioned?
If so, what do you think of:
- making `nm-run.sh` write `/tmp/nm-done.d/<uuid of the completed connection>`
- making the loop towards the end of `nm_generate_connections` iterate over all of the connection keyfiles in `/run/NetworkManager/system-connections/*`, etc., and having `finished/nm.sh` watch for all of the `/tmp/nm-done.d/<uuid of the expected connection>` files?
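A hypothetical sketch of the second bullet (the function name, the `/tmp/nm-done.d` path, and the assumption that each generated keyfile carries a `uuid=` line are all mine): derive the expected UUIDs from the generated keyfiles and emit a finished hook that waits for one marker per connection.

```shell
#!/bin/sh
# Sketch: generate a finished hook that succeeds only when every generated
# connection has a /tmp/nm-done.d/<uuid> marker.
gen_finished_hook() {
    conn_dir="$1"   # directory of generated keyfiles
    hook="$2"       # path of the finished hook to write
    {
        for kf in "$conn_dir"/*.nmconnection; do
            [ -e "$kf" ] || continue
            uuid=$(sed -n 's/^uuid=//p' "$kf")
            printf '[ -f /tmp/nm-done.d/%s ] || exit 1\n' "$uuid"
        done
        echo 'exit 0'
    } > "$hook"
}
```

Because the hook is generated right after `nm-initrd-generator` runs, the set of UUIDs it waits for is frozen at that point.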
Since nm-initrd.service is After=dracut-cmdline.service, would it be correct to say that at the point when nm-initrd-generator completes, /run/NetworkManager/system-connections should be for sure a clean set of the devices in the first group you mentioned?
Not exactly. `/run/NetworkManager/system-connections` contains a list of connections. Each connection can be activated on 0, 1, or more devices. When the generator runs, it doesn't know which devices are available, and it generates connections only based on the kernel command line (because at the time the generator runs, there is no guarantee that all devices have been discovered by the kernel).
I see, that makes sense. Thanks for explaining. Is the below summary of what you've said about the desired semantics of `nm-run.sh` accurate?
- Once a connection becomes active, the online hooks should be run on all its devices, if any
- `nm-run.sh` should wait until all connections generated by `nm_generate_connections` are active, and should not wait for other connections (although the online hooks can be run for those connections' devices)
If so, how about yet another approach:
- `nm_generate_connections` saves off the UUIDs of the connections generated by `nm-initrd-generator`. This freezes the list of connections, and should take care of distinguishing which connections we should and shouldn't wait for.
- Then `nm-run.sh` later comes along and iterates through all the NM connections, grabbing each connection's corresponding device list from `nmcli --get-values GENERAL.DEVICES connection show <connection UUID>`. If a connection has no, or more than one, device, we can handle it by looping through its device list. If it meets a connection that isn't on the frozen list of connections, then that connection shouldn't inhibit writing `nm-done.sh`.
@bengal where are we at with this one?
If it helps, I updated this PR with an implementation of the approach I outlined in my previous comment.
I haven't got around to testing the multiple-connection scenario, or the scenarios where one connection has multiple devices or none. Do you have any suggestions for a kernel argument string that would create the latter scenario?
Then nm-run.sh later comes along and iterates through all the NM connections, grabbing those connections' corresponding device list from nmcli --get-values GENERAL.DEVICES connection show <connection UUID>. If a connection has no, or more than one, device, we can handle it by looping through its device list. If it meets a connection that isn't on the frozen list of connections, then it shouldn't inhibit writing nm-done.sh
I don't think this would work, because GENERAL.DEVICES returns the device only once the connection has started activating on a device. If the device is waiting for carrier, then the connection is not activating and the command returns an empty string.
The problem we need to solve is: we want to call the dracut hooks also for devices that become ready (e.g. appear or get carrier) after NM has reached the 'online' target.
On second thought, it seems to me that the problem can't be solved by delaying the boot until ALL devices are ready, because:
- in case a device never gets carrier, the boot will be delayed forever (or until other, longer dracut timeouts kick in), while we want to respect the `rd.net.timeout.carrier=` parameter
- the script can't predict which devices will appear later
--
I think it would be better to solve the problem in a different way: instead of having `nm-run.sh` call `source_hook initqueue/online` and `/sbin/netroot` for all known devices at the same time, we could use the dispatcher-script feature of NM to call a script every time an interface goes up in NM. This script would call those two dracut hooks for the interface that was just activated.
In this way, even if an interface goes up after NM reaches the online target, the hooks will be called.
However, I'm not sure those hooks can be called directly from an external process; probably they must be called from within the dracut initqueue (this is a question for @johannbg). In that case, we need to use an indirect way: the dispatcher script doesn't call the hooks directly, but instead drops a script inside one of the initqueue directories, which then gets processed by the initqueue.
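A rough sketch of that indirect route (the file names, and exactly how the interface name is handed to `source_hook`, are assumptions; NetworkManager invokes dispatcher scripts with the interface and action as the first two arguments):

```shell
#!/bin/sh
# Sketch of an NM dispatcher handler: on "up", drop a one-shot script into
# the initqueue so the dracut hooks run from initqueue context.
HOOKDIR="${HOOKDIR:-/lib/dracut/hooks}"

nm_dispatch() {
    iface="$1" action="$2"
    [ "$action" = "up" ] || return 0
    mkdir -p "$HOOKDIR/initqueue"
    # The dropped script runs the online hooks for this interface,
    # then removes itself so it only fires once.
    cat > "$HOOKDIR/initqueue/nm-online-$iface.sh" <<EOF
. /lib/dracut-lib.sh
source_hook initqueue/online $iface
rm -f -- "\$0"
EOF
}
```

The dispatcher process itself never touches the hooks; it only communicates with the initqueue through the dropped file, which sidesteps the different-context problem.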
This issue is being marked as stale because it has not had any recent activity. It will be closed if no further activity occurs. If this is still an issue in the latest release of Dracut and you would like to keep it open please comment on this issue within the next 7 days. Thank you for your contributions.
Wait, stale-bot, don't close yet! I'm back! Sorry for the delay.
On a second thought, it seems to me that the problem can't be solved by delaying the boot until ALL devices are ready because: in case a device never gets carrier, the boot will be delayed forever (or until other longer dracut timeouts kick in), while we want to respect the rd.net.timeout.carrier= parameter
@bengal, I'm afraid I'm not following what you mean here. I get that we want to delay boot not for all devices, but only for those from `nm_generate_connections`; this is what you pointed out previously in this thread. I do think we're on the same page about this: my proposed logic has it such that only those connections from `nm_generate_connections` (what I called "frozen connections") inhibit boot progress, and any other connections don't. (Modulo the bit about `GENERAL.DEVICES`, which I will need to study up on.)
So now I don't think I fully understand how `rd.net.timeout.carrier` plays a role in `nm-run.sh`. I thought this parameter only regulates the duration of `nm-online -s`, which basically dictates how long it takes before `nm-run.sh` is called for the first time to run hooks on devices that are already online. What am I missing?
How about a concrete example, so you can easily point out where my logic goes off track:
- The kernel arguments are `ip=enp1s0:dhcp ip=enp2s0:dhcp ip=enp3s0:dhcp`
- The machine also has `enp4s0` and `enp5s0`, but they aren't in the kernel arguments, so `nm_generate_connections` doesn't generate a connection for them
- `nm-online -s` waits for `enp1s0`, `enp2s0`, and `enp3s0`. `enp1s0` is fast and comes up before `rd.net.timeout.carrier` elapses; but `enp2s0` and `enp3s0` are not yet done by then, so `nm-online -s` exits
- `nm-run.sh` starts. It finds `enp1s0` is associated with an active NM connection, and runs the dracut hooks on this device
- Some time later, `enp2s0`'s connection activates too, and `nm-run.sh` runs the dracut hooks on this device
- Then `enp4s0`'s connection activates too, and `nm-run.sh` runs the dracut hooks on this device too (even though it wasn't in the kernel arguments; this is to comply with the man page for the online hook)
- `nm-run.sh` keeps waiting for `enp3s0` until the dracut initqueue timeout elapses:
  - If `enp3s0`'s connection activates before the dracut initqueue timeout elapses, then `nm-run.sh` runs the dracut hooks on `enp3s0`, reports done, and the boot continues
  - Otherwise, the boot fails (because we didn't get all the devices we specified in the kernel arguments)
- `nm-run.sh` doesn't wait for `enp5s0`, since it wasn't in the kernel arguments