crowbar-core

Improve how we sort interfaces when there's no interface map

Open vuntz opened this issue 9 years ago • 9 comments

When there's no interface map, we sort interfaces by name. That is kind of okay-ish, except that on first boot the names are assigned on a first-come, first-served basis, which might not match expectations.

So instead, when there's no interface map, sort by bus order and fall back on the name.
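
The proposal could be sketched roughly like this in Python. This is only an illustration, not crowbar's actual code: `bus_path_key` and `sort_interfaces` are hypothetical names, and the input is assumed to be a mapping from interface name to bus path (or `None` when the path is unknown).

```python
from typing import Dict, List, Optional, Tuple


def bus_path_key(bus_path: str) -> Tuple[Tuple[int, ...], ...]:
    """Turn '0000:00/0000:00:03.0/0000:01:00.0' into comparable integers.

    Each '/'-separated segment is split on ':' and '.', and every field is
    parsed as hexadecimal, so 1c sorts after 03 by value, not by spelling.
    """
    return tuple(
        tuple(int(part, 16) for part in segment.replace(".", ":").split(":"))
        for segment in bus_path.split("/")
    )


def sort_interfaces(interfaces: Dict[str, Optional[str]]) -> List[str]:
    """Sort interface names by bus order, falling back on the name.

    Interfaces without a known bus path (an assumption of this sketch)
    are placed last and ordered by name.
    """
    def key(name: str):
        path = interfaces[name]
        if path is None:
            return (1, (), name)
        return (0, bus_path_key(path), name)

    return sorted(interfaces, key=key)
```

Placing interfaces with an unknown bus path last is a design choice of this sketch; the issue itself only asks for a fallback on the name.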

cc @rhafer

Maybe something for the next version, though?

vuntz avatar Feb 23 '16 23:02 vuntz

~~personally I'd prefer sorting by the persistent device name, as that's the only slightly predictable name. bus order is weird.~~

I'm afraid that we need a cloud 6 solution, since this is a pretty bad regression from the sle12 upgrade :(

dirkmueller avatar Feb 24 '16 08:02 dirkmueller

> I'm afraid that we need a cloud 6 solution, since this is a pretty bad regression from the sle12 upgrade :(

Bad regression from which sle12 upgrade? Can you please clarify on that? I am not sure what you are referring to.

rhafer avatar Feb 24 '16 08:02 rhafer

@vuntz Is there a bug report for this? I'd like to understand the problem you are trying to fix here.

As of now, I think the suggested fix has a high potential to break networking on systems upgrading from cloud 5, because the ordering might change for them when they transition back to ready.

rhafer avatar Feb 24 '16 09:02 rhafer

@dirkmueller BTW, what do you mean by persistent device name? Is that referring to systemd's predictable names (e.g. enp0s25)? Or do you mean the names created by biosdevname (em1, p3p4)?

rhafer avatar Feb 24 '16 09:02 rhafer

> personally I'd prefer sorting by the persistent device name, as that's the only slightly predictable name. bus order is weird.

But sorting by bus order is actually what I would expect: crowbar is sorting by bus order already when there's an interface map; and the natural thing to do is to use the normal bus order when no interface map is provided.

> > I'm afraid that we need a cloud 6 solution, since this is a pretty bad regression from the sle12 upgrade :(
>
> Bad regression from which sle12 upgrade? Can you please clarify on that? I am not sure what you are referring to.

It's actually not a complete regression from sle12 since it could also happen on sle11 with interfaces having different names (or were all interfaces named ethX in sle11?).

But the issue in sle12 is that interface names are simply assigned in no specific order (while it was somewhat predictable in sle11, it seems), which makes this issue more visible.

vuntz avatar Feb 24 '16 09:02 vuntz

> @vuntz Is there a bug report for this? I'd like to understand the problem you are trying to fix here.

On some hardware from Alejandro, I had to specify an interface map to get the expected order. But the interface map I provided was sorted:

        {
          "bus_order": [
            "0000:00/0000:00:03.0/0000:01:00.0",
            "0000:00/0000:00:03.0/0000:01:00.1",
            "0000:00/0000:00:1c.4/0000:06:00.0",
            "0000:00/0000:00:1c.4/0000:06:00.1"
          ],
          "pattern": "PowerEdge R730xd"
        },

Without this, crowbar was actually using this order:

            "0000:00/0000:00:1c.4/0000:06:00.0",
            "0000:00/0000:00:1c.4/0000:06:00.1",
            "0000:00/0000:00:03.0/0000:01:00.0",
            "0000:00/0000:00:03.0/0000:01:00.1"

and that's only because the interface names were:

        eth0    =>  "0000:00/0000:00:1c.4/0000:06:00.0"
        rename3 =>  "0000:00/0000:00:1c.4/0000:06:00.1"
        rename4 =>  "0000:00/0000:00:03.0/0000:01:00.0"
        rename5 =>  "0000:00/0000:00:03.0/0000:01:00.1"

and with no interface map, we sort by interface name.
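
To make the difference concrete, here is a small Python illustration using the names and bus paths listed above. It is not crowbar's code; the variable names are made up for this example. Plain string comparison of the bus paths happens to work here because all fields are fixed-width lowercase hex, which is an assumption that may not hold everywhere.

```python
# Interface name -> bus path, as observed on the PowerEdge R730xd above.
names_to_bus = {
    "eth0": "0000:00/0000:00:1c.4/0000:06:00.0",
    "rename3": "0000:00/0000:00:1c.4/0000:06:00.1",
    "rename4": "0000:00/0000:00:03.0/0000:01:00.0",
    "rename5": "0000:00/0000:00:03.0/0000:01:00.1",
}

# Order obtained when sorting by interface name (no interface map).
by_name = [names_to_bus[name] for name in sorted(names_to_bus)]

# Order obtained when sorting by bus path instead.
by_bus = sorted(names_to_bus.values())

for label, order in (("by name", by_name), ("by bus", by_bus)):
    print(label)
    for path in order:
        print("   ", path)
```

Sorting by name puts the `1c.4` devices first, while sorting by bus path gives the same order as the `bus_order` list in the interface map.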

vuntz avatar Feb 24 '16 10:02 vuntz

@rhafer I probably meant the systemd persistent device names (I didn't realize that those are two different things). But that's probably one layer of indirection too many. Thinking more about it, maybe sorting by PCI IDs is the best approach we can get.

dirkmueller avatar Feb 24 '16 10:02 dirkmueller

:+1:

tboerger avatar Mar 07 '16 09:03 tboerger

While ordering by bus IDs seems to be a good approach in the first place, there are some additional things to consider. E.g. systems using biosdevname for naming the NICs (YaST does this for Dell machines) might get an unexpected ordering. (Usually people would expect the interfaces to be ordered to match what's printed on the server's chassis, as that's what biosdevname is supposed to do.)

The main problem in the specific case this patch is addressing was, by the way, that it was a Dell system: while YaST applied the ordering as defined in the BIOS (using biosdevname), sleshammer didn't do that (it currently lacks biosdevname) and just ordered interfaces in detection order. I am pretty sure that it would have (mis)behaved in pretty much the same way on SLES11.

rhafer avatar Mar 08 '16 11:03 rhafer