
rename attempt of already renamed device

Open rjschwei opened this issue 2 years ago • 11 comments

Bug report

When cloud-init is instructed to rename an interface it should check whether the name is already in use before attempting to rename the network device.

We also have to decide what the behavior should be when the name is detected to exist.

Steps to reproduce the problem

There is an inherent race condition between cloud-init and udev that occurs when the system is rebooted before cloud-init has completely finished. When a network interface is assigned a name by the user through the network configuration, cloud-init writes this information into the udev rules file (persistent-net-cloud-init.rules) generated during the init-local phase. If the system is rebooted after cloud-init local is complete and the same network config is provided by the user, cloud-init starts again as usual, but on this boot the udev rules file already exists, so udev runs and processes the rules in the previously generated persistent-net-cloud-init.rules file. While the file itself gets overwritten during the new run, udev has already read and processed the old file. At this point it is only a matter of time until udev creates the renamed device, and by the time cloud-init comes around the rename has already happened and cloud-init generates an error:

""" Command: ['ip', 'link', 'set', 'enp55s1f0', 'name', 'ccd_ban'] Exit code: 2 Reason: - Stdout: Stderr: RTNETLINK answers: File exists """

As a result the interface remains down.
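A minimal sketch of the guard this report asks for: skip the rename when the target name already exists under /sys/class/net. This is a hypothetical helper for illustration, not cloud-init code; `rename_if_free` and `name_in_use` are names invented here.

```python
import os
import subprocess

SYS_CLASS_NET = "/sys/class/net"

def name_in_use(target, sys_net=SYS_CLASS_NET):
    # The kernel exposes every current interface name as an entry under
    # /sys/class/net, so an existing entry means the name is taken.
    return os.path.exists(os.path.join(sys_net, target))

def rename_if_free(current, target, sys_net=SYS_CLASS_NET):
    """Rename `current` to `target` unless `target` already exists.

    Returns False when the rename is skipped because udev (or a
    previous run) already created the target name.
    """
    if name_in_use(target, sys_net):
        return False
    subprocess.run(["ip", "link", "set", current, "name", target], check=True)
    return True
```

With a check like this, the boot described above would log a skip instead of failing with `RTNETLINK answers: File exists`; the open question from the report remains what the desired behavior should be when the name is found to exist.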

Environment details

  • Cloud-init version: 23.3
  • Operating System Distribution: SLES
  • Cloud provider, platform or installer type:

cloud-init logs

cloud-init starts at 2024-02-08 05:54:17,618, then the network devices are enumerated, starting at 2024-02-08 05:54:17,790 and ending at 2024-02-08 05:54:17,829.

Then network devices are enumerated again, starting at 2024-02-08 05:54:17,838 and ending at 2024-02-08 05:54:17,843. This is pretty much at the end of the cloud-init local phase. After all this we write out /etc/udev/rules.d/85-persistent-net-cloud-init.rules

The udev rules are written at

2024-02-08 05:54:17,846

Then at 2024-02-08 05:55:00,230 cloud-init starts its "init" phase, with net device configuration starting again at 2024-02-08 05:55:00,276 and ending at 2024-02-08 05:55:00,292.

Then at 2024-02-08 05:55:03,848 the final phase of cloud-init starts, and at 2024-02-08 05:56:08,100 cloud-init gets killed. -> Reboot is triggered

After the system reboot cloud-init starts again at 2024-02-08 05:58:36,710, and the network device enumeration starts at 2024-02-08 05:58:36,950 and ends at 2024-02-08 05:58:37,009, close to the end of the second local phase. The second "init" phase starts at 2024-02-08 05:58:45,541, with the network device enumeration in init starting at 2024-02-08 05:58:45,589. Only now udev has caught up, and during this enumeration cycle we see

2024-02-08 05:58:45,590 - util.py[DEBUG]: Reading from /sys/class/net/ccd_ban/address

rjschwei avatar Mar 19 '24 20:03 rjschwei

@rjschwei , can you attach the results of cloud-init collect-logs to this bug?

TheRealFalcon avatar Mar 20 '24 13:03 TheRealFalcon

cloud-init.tar.gz

rjschwei avatar Mar 20 '24 17:03 rjschwei

It may be irrelevant, since the timestamp at which the script is called comes after cloud-init's failure to rename an already-existing device on the most recent boot, but what's also a bit confusing to me here is the journalctl entry Feb 08 06:59:06.520604 pcsstag01-mn002 cloud-init[5205]: + python3 /node-files/bond_workaround.py /etc/udev/rules.d/85-persistent-net-cloud-init.rules.

Given that our captured journalctl logs only represent the most recent boot, I can't see whether that script was run on the previous boot.

Do we know what bond_workaround.py does and can we also confirm the contents of /etc/udev/rules.d/85-persistent-net-cloud-init.rules remain unchanged by this bond_workaround.py script? I want to make sure we aren't also interacting with another script that's manipulating cloud-init's udev rules. I'd expect given the network config the contents of 85-persistent-net-cloud-init.rules to be something like the following even after this bond_workaround.py script is called:

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="72:58:28:f0:03:55", NAME="enp55s1f0"

blackboxsw avatar Mar 20 '24 18:03 blackboxsw

The second variable at play here seems to be the network-manager-fix-ovs-interfaces.py script, triggered by the systemd unit with the description "Service that will assign mac address for the sriov vf", as it also looks to be triggering interface up/down events that could exacerbate whatever race we think we are hitting:

Feb 08 06:58:45.144226 pcsstag01-mn002 network-manager-fix-ovs-interfaces.py[3744]: Checking for ovs-interfaces which are in error state
Feb 08 06:58:45.144226 pcsstag01-mn002 network-manager-fix-ovs-interfaces.py[3744]: Chekcking if  the NM startup complete
Feb 08 06:58:45.144226 pcsstag01-mn002 network-manager-fix-ovs-interfaces.py[3744]: cmd = nmcli connection | grep 'ovs-interface\|ethernet' | grep -e "--" | awk '{print $1}'
Feb 08 06:58:45.144226 pcsstag01-mn002 network-manager-fix-ovs-interfaces.py[3744]: faulty device =
Feb 08 06:58:45.144226 pcsstag01-mn002 network-manager-fix-ovs-interfaces.py[3744]: All interfaces are in connected state, nothing to fix
Feb 08 06:58:45.145008 pcsstag01-mn002 bash[4187]: No VFs configured for device ens3f1
Feb 08 06:58:45.146377 pcsstag01-mn002 bash[4187]: No VFs configured for device ens3f4
Feb 08 06:58:45.147677 pcsstag01-mn002 bash[4187]: No VFs configured for device ens3f5
Feb 08 06:58:45.148980 pcsstag01-mn002 bash[4187]: No VFs configured for device ens3f6
Feb 08 06:58:45.150311 pcsstag01-mn002 bash[4187]: No VFs configured for device ens3f7
Feb 08 06:58:45.152964 pcsstag01-mn002 systemd[1]: Starting Initial cloud-init job (metadata service crawler)...
Feb 08 06:58:45.172659 pcsstag01-mn002 systemd[1]: network_manager_fix_ovs_interfaces.service: Deactivated successfully.
Feb 08 06:58:45.172848 pcsstag01-mn002 systemd[1]: Finished Check the ovs-interfaces that are in connecting state and trigger connection up to bring up the interfaces.
Feb 08 06:58:45.188536 pcsstag01-mn002 systemd[1]: assign_vf_mac_address.service: Deactivated successfully.
Feb 08 06:58:45.188713 pcsstag01-mn002 systemd[1]: Finished Service that will assign mac address for the sriov vf.
Feb 08 06:58:45.708085 pcsstag01-mn002 kernel: [qede_unload:3901(enp55s1f0)]Link is down

blackboxsw avatar Mar 20 '24 19:03 blackboxsw

Do we know what bond_workaround.py does and can we also confirm the contents of /etc/udev/rules.d/85-persistent-net-cloud-init.rules remain unchanged by this bond_workaround.py script?

I am still waiting for the content of the udev rules and now have also requested the content of the workaround script.

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="72:58:28:f0:03:55", NAME="enp55s1f0"

Yes that is what I expect in the udev rules as well.

Thanks for picking up on the workaround script.

rjschwei avatar Mar 20 '24 20:03 rjschwei

Have the workaround script:

#!/usr/bin/python3

import re
import shutil
import sys

def main(udev_rule_file):
  mac_list = []
  with open(udev_rule_file, "r") as readstream, open(udev_rule_file + '.cleaned', "w") as writestream:
    for line in readstream:
      mac = get_mac_address(line)
      if mac not in mac_list:
        mac_list.append(mac)
        writestream.write(line)

  shutil.move(udev_rule_file + '.cleaned', udev_rule_file)

def get_mac_address(line):
  elements = line.split(",")
  for x in elements:
    if "address" in x:
      x = re.sub(r'["\{\}\*]', '', x)
      return x.split("==")[1]

if __name__ == '__main__':
  main(sys.argv[1])

I don't recall seeing a bug report that we echoed devices with duplicate MAC addresses to the generated udev files. But in any event this would do no harm in this case.
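For reference, the MAC extraction in the workaround script behaves like this on a rule line of the form shown earlier (a self-contained copy of the helper, with the rule line taken from this thread):

```python
import re

def get_mac_address(line):
    # Find the comma-separated element carrying the address match,
    # strip quotes/braces/asterisks, and return the value after "==".
    for x in line.split(","):
        if "address" in x:
            x = re.sub(r'["\{\}\*]', '', x)
            return x.split("==")[1]

rule = ('SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
        'ATTR{address}=="72:58:28:f0:03:55", NAME="enp55s1f0"')
print(get_mac_address(rule))  # 72:58:28:f0:03:55
```

So the script deduplicates rules by their MAC address and leaves a single-entry file like the one above untouched.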

rjschwei avatar Mar 21 '24 12:03 rjschwei

Confirmed the generated udev rule file contains:

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="72:58:28:f0:03:55", NAME="enp55s1f0"

rjschwei avatar Mar 25 '24 12:03 rjschwei

@TheRealFalcon @blackboxsw after creating #5086 and retesting I have a little better orientation around the code flow, but am still missing all the bits to put the puzzle together. From the test with the patch I have extracted the following from the log:

2024-03-25 16:51:39,910 - stages.py[INFO]: Applying network configuration from fallback bringup=False:

2024-03-25 16:51:39,919 - sysconfig.py[INFO]: Generated udev rules /etc/udev/rules.d/85-persistent-net-cloud-init.rules already exist, skippingfurther network setup processing.

2024-03-25 16:51:48,443 - main.py[DEBUG]: [net] init will now be targeting instance id: b13c1cd7-8bec-4e65-9fa8-6cfe71987c76. new=False

2024-03-25 16:51:48,480 - stages.py[DEBUG]: No network config applied. Neither a new instance nor datasource network update allowed

2024-03-25 16:51:48,502 - __init__.py[DEBUG]: achieving renaming of [['72:58:28:f0:03:55', 'ccd_ban', None, None]] with ops

The

No network config applied. Neither a new instance nor datasource network update allowed

was also in the previous log and I missed it, meaning #5086 is in fact useless. That said, the mystery is around https://github.com/canonical/cloud-init/blob/main/cloudinit/stages.py#L1040. We have decided that no network configuration should be applied (the message at 2024-03-25 16:51:48,480), and then we rename devices and return from apply_network_config(), which, as best I can reconstruct, was called at this part of the process from https://github.com/canonical/cloud-init/blob/main/cloudinit/cmd/main.py#L435, and this is where I lose track of the code flow. After line 435 in main.py I do not understand how we end up in cloudinit/net/__init__.py where another rename attempt is made, i.e. the message at 2024-03-25 16:51:48,502.

One would think that after https://github.com/canonical/cloud-init/blob/main/cloudinit/stages.py#L1040 we would not try another rename attempt. But since I cannot put the pieces together, I cannot quite figure out how to stop it.

Help is appreciated.

rjschwei avatar Mar 29 '24 17:03 rjschwei

We have decided that no network configuration should be applied, the message at 2024-03-25 16:51:48,480, and then we rename devices and return from apply_network_config()

That final log is part of the rename devices you mention. As you say, update_event_enabled() returns False, which leads us to the no network config log. Calling into the next line, we should see another log that I'm not seeing in your paste. Did you strip it out?

After that we call this, which should lead to this, which should lead to this, which calls _rename_interfaces(), which leads to your line in question.

Does that help?

TheRealFalcon avatar Apr 08 '24 02:04 TheRealFalcon

The "applying net config names" message exists and, yes, I missed that in the snippet in the previous comment:

2024-03-25 16:51:48,480 - stages.py[DEBUG]: applying net config names for {'ethernets': {'ccd_ban': {'dhcp4': True, 'set-name': 'ccd_ban', 'match': {'macaddress': '72:58:28:f0:03:55'}}}, 'version': 2}

After this the enumeration of network devices follows. Which starts with

2024-03-25 16:51:48,480 - util.py[DEBUG]: Reading from /sys/class/net/ccd_ban/device/device (quiet=False)

So we are reading from a path that we are later trying to create and that's where the problem occurs, but I still do not understand why.

I have created a brute force avoidance approach

--- cloudinit/net/__init__.py.orig
+++ cloudinit/net/__init__.py
@@ -7,6 +7,7 @@
 
 import errno
 import functools
+import glob
 import ipaddress
 import logging
 import os
@@ -698,6 +699,12 @@ def _rename_interfaces(
     renames, strict_present=True, strict_busy=True, current_info=None
 ):
 
+    net_devs = glob.glob('/sys/class/net/*')
+    for dev in net_devs:
+        name = dev.split('/')[-1]
+        if name in renames:
+            del renames[renames.index(name)]
+
     if not len(renames):
         LOG.debug("no interfaces to rename")
         return

But this appears to be a big hammer to plaster over a problem, and it should not be needed.

I also noticed

2024-03-25 16:51:48,493 - __init__.py[DEBUG]: Detected ccd_int as an OVS interface

This means that when get_interfaces() is called the filter_openvswitch_internal argument is True, and maybe it shouldn't be?

But I do not know what the criteria should be for when this should be True or False.

rjschwei avatar Apr 08 '24 11:04 rjschwei

And the fixed brute force approach ;)

+++ cloudinit/net/__init__.py
@@ -7,6 +7,7 @@
 
 import errno
 import functools
+import glob
 import ipaddress
 import logging
 import os
@@ -698,6 +699,12 @@ def _rename_interfaces(
     renames, strict_present=True, strict_busy=True, current_info=None
 ):
 
+    net_devs = glob.glob('/sys/class/net/*')
+    for dev in net_devs:
+        dev_name = dev.split('/')[-1]
+        for dev_data in renames:
+            if dev_name in dev_data:
+                del renames[renames.index(dev_data)]
+
     if not len(renames):
         LOG.debug("no interfaces to rename")
         return
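As a side note, the `del renames[...]` inside a loop over `renames` mutates the list while iterating it, which can skip entries. Building a new list avoids that. A minimal sketch of the same filter, assuming the `[mac, new_name, driver, device_id]` entry layout seen in the debug log above (`drop_existing_renames` is a name invented here, not a cloud-init function):

```python
import os

def drop_existing_renames(renames, sys_net="/sys/class/net"):
    """Return only the rename entries whose target name is not already
    present in sysfs (a sketch, not the cloud-init implementation).

    Each entry is expected to look like [mac, new_name, driver, device_id],
    matching the "achieving renaming of [...]" debug log.
    """
    existing = set(os.listdir(sys_net))
    # entry[1] is the target name the rename would create
    return [entry for entry in renames if entry[1] not in existing]
```

Semantically this is the same guard as the diff above: any rename whose target already exists in /sys/class/net is dropped before `_rename_interfaces` tries to apply it.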

rjschwei avatar Jun 03 '24 19:06 rjschwei