fix(azure): use primary NIC when iface is None in ephemeral DHCP setup

Open cadejacobson opened this issue 2 months ago • 5 comments

Proposed Commit Message

fix(azure): ensure ephemeral networking uses primary NIC

When ephemeral networking is brought up in Azure, the selected interface
(`iface`) is `None` whenever the caller does not pass one explicitly.
This change introduces a helper function `find_primary_nic()` which uses
`net.find_candidate_nics()` to determine and select the system's primary
network interface if one is not explicitly set.

Additional improvements:
- Log the MAC address and driver of the selected NIC for easier debugging.
- Ensure the DHCP retry loop updates `iface` to the current primary NIC before
  retrying to obtain a lease.

Fixes #6558

Updated Output

After these changes, the output for creating a VM reads:

2025-11-04 21:47:15,765 - azure.py[DEBUG]: Bringing up ephemeral networking with iface=eth0 mac=7c:1e:52:e8:55:f9 driver=hv_netvsc : [('lo', '00:00:00:00:00:00', None, None), ('eth1', '7c:1e:52:e8:5e:80', 'hv_netvsc', '0x3'), ('eth0', '7c:1e:52:e8:55:f9', 'hv_netvsc', '0x3')]
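
As a rough illustration of the behavior described above (not the code in this PR), the fallback selection, the log fields, and the retry behavior could be sketched as follows. The callable `list_candidates` stands in for `net.find_candidate_nics()`, and `describe_nic()`, `lease_with_retries()`, and `try_dhcp` are hypothetical names used only for this sketch. Treating the first candidate as the primary NIC is an assumption.

```python
from typing import Callable, List, Optional, Tuple

# (name, mac, driver, device_id), matching the tuples in the log line above.
InterfaceInfo = Tuple[str, str, Optional[str], Optional[str]]


def find_primary_nic(list_candidates: Callable[[], List[str]]) -> Optional[str]:
    """Return the first candidate NIC name, or None if there are none.

    `list_candidates` stands in for net.find_candidate_nics(); picking the
    first entry as the primary NIC is an assumption of this sketch.
    """
    candidates = list_candidates()
    return candidates[0] if candidates else None


def describe_nic(iface: Optional[str], interfaces: List[InterfaceInfo]) -> str:
    """Build the iface/mac/driver part of the debug message (hypothetical)."""
    for name, mac, driver, _device_id in interfaces:
        if name == iface:
            return f"iface={name} mac={mac} driver={driver}"
    return f"iface={iface} mac=None driver=None"


def lease_with_retries(
    try_dhcp: Callable[[str], Optional[dict]],
    list_candidates: Callable[[], List[str]],
    iface: Optional[str] = None,
    attempts: int = 3,
) -> Tuple[Optional[str], Optional[dict]]:
    """Retry DHCP, re-resolving the primary NIC before each attempt.

    An explicitly passed `iface` is kept as-is; only the None case is
    re-resolved. `try_dhcp` is a hypothetical callable that returns a
    lease dict on success and None on failure.
    """
    current = iface
    for _ in range(attempts):
        if iface is None:
            current = find_primary_nic(list_candidates)
        if current is not None:
            lease = try_dhcp(current)
            if lease is not None:
                return current, lease
    return current, None


# Sample data copied from the log line above.
interfaces: List[InterfaceInfo] = [
    ("lo", "00:00:00:00:00:00", None, None),
    ("eth1", "7c:1e:52:e8:5e:80", "hv_netvsc", "0x3"),
    ("eth0", "7c:1e:52:e8:55:f9", "hv_netvsc", "0x3"),
]
primary = find_primary_nic(lambda: ["eth0", "eth1"])
print(describe_nic(primary, interfaces))
```

Passing the candidate lookup in as a callable keeps the sketch independent of cloud-init and lets the retry loop see a different primary NIC on a later attempt.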

Merge type

  • [x] Squash merge using "Proposed Commit Message"

Fixes #6558

Nov 03 '25 15:11 cadejacobson

Thanks for the contribution.

Please link this to the bug that this is supposed to fix. If none exists, one should be created containing full context (logs where this happened with any other relevant details).

> When ephemeral networking is brought up in Azure, the selected interface (iface) may be None under certain conditions.

Which conditions? Were those conditions reproducible, and was this change demonstrated to fix the problem when they occur?

Nov 04 '25 01:11 holmanb

> Thanks for the contribution.
>
> Please link this to the bug that this is supposed to fix. If none exists, one should be created containing full context (logs where this happened with any other relevant details).
>
> > When ephemeral networking is brought up in Azure, the selected interface (iface) may be None under certain conditions.
>
> Which conditions? Were those conditions reproducible, and was this change demonstrated to fix the problem when they occur?

Sounds good! I will file a bug with the relevant context. As a very short overview: three out of four calls to `_setup_ephemeral_networking()` do not pass an explicit interface, so `iface` defaults to `None`. When we enter the ephemeral networking function and log which interface we are bringing up networking on, the log simply reads `Bringing up ephemeral networking with iface=None`. I will map out these conditions with more verbose logs in the issue and link it back here!
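
To make the default concrete, here is a simplified stand-in (not the actual `azure.py` code). The function name `setup_ephemeral_networking_sketch` is hypothetical; only the log message text is taken from the logs described above.

```python
from typing import Optional


def setup_ephemeral_networking_sketch(iface: Optional[str] = None) -> str:
    """Hypothetical stand-in: return the debug message for the given iface."""
    # When the caller omits `iface`, the default None ends up in the log.
    return f"Bringing up ephemeral networking with iface={iface}"


# A caller that passes no interface versus one that names it explicitly.
print(setup_ephemeral_networking_sketch())
print(setup_ephemeral_networking_sketch(iface="eth0"))
```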

Thank you for the guidance 😁 This is my first PR to cloud-init, so I left it as a draft while I compile the materials required for a proper review.

Nov 04 '25 18:11 cadejacobson

@cadejacobson Thanks for filing the bug. No worries - and welcome!

Nov 04 '25 20:11 holmanb

@holmanb I have gotten this work to its final state. Could I please have you run the pipelines one more time to verify we do not need minor changes? Thanks so much!

Nov 06 '25 23:11 cadejacobson

It looks like this is waiting for updates per the latest review, so I'll hold off on reviewing it for now.

Nov 20 '25 19:11 holmanb