Can't find proper metadata source IP - Interoperability problem with CentOS8/Stream, NetworkManager and Apache CloudStack
This bug was originally filed in Launchpad as LP: #1915216
Launchpad details
affected_projects = []
assignee = None
assignee_name = None
date_closed = None
date_created = 2021-02-09T23:35:27.906402+00:00
date_fix_committed = None
date_fix_released = None
id = 1915216
importance = medium
is_complete = False
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1915216
milestone = None
owner = jdoe666
owner_name = Peter M.
private = False
status = confirmed
submitter = jdoe666
submitter_name = Peter M.
tags = []
duplicates = []
Launchpad user Peter M.(jdoe666) wrote on 2021-02-09T23:35:27.906402+00:00
System environment: Apache CloudStack 4.11; KVM zone
CentOS 8 and CentOS Stream both use NetworkManager. The cloud-init version currently packaged there is 20.3-9.el8.
We are talking about the code of the CloudStack datasource.
What we observe is that on our CentOS test systems, cloud-init falls through to the default_gateway() method and returns the virtual router (VR) IP address 192.xxx.xxx.1. This is wrong, however: that IP does not serve metadata. For comparison, an Ubuntu 20.04 instance deployed on the same network resolves to 192.xxx.xxx.5.
This IP can be found under /run/NetworkManager:
./NetworkManager/resolv.conf:nameserver 192.xxx.xxx.5
./NetworkManager/no-stub-resolv.conf:nameserver 192.xxx.xxx.5
./NetworkManager/devices/2:next-server=192.xxx.xxx.5
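To illustrate where that value lives, here is a minimal Python sketch that pulls a `next-server=` entry out of the per-device files under /run/NetworkManager/devices. It assumes those files contain plain `key=value` lines, as the listing above suggests. The function names are hypothetical and are not part of cloud-init or NetworkManager.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_next_server(lines: Iterable[str]) -> Optional[str]:
    """Return the value of the first 'next-server=<value>' line, or None."""
    for raw in lines:
        key, sep, value = raw.strip().partition("=")
        if sep and key.strip() == "next-server" and value.strip():
            return value.strip()
    return None


def next_server_from_run_dir(
    run_dir: str = "/run/NetworkManager/devices",
) -> Optional[str]:
    """Scan each per-device file in run_dir and return the first hit.

    The directory layout is an assumption taken from the listing in
    this report; it is not a documented, stable interface.
    """
    base = Path(run_dir)
    if not base.is_dir():
        return None
    for dev_file in sorted(base.iterdir()):
        if not dev_file.is_file():
            continue
        try:
            found = find_next_server(dev_file.read_text().splitlines())
        except (OSError, UnicodeDecodeError):
            # Unreadable or non-text file: skip it and keep looking.
            continue
        if found:
            return found
    return None
```

Calling `next_server_from_run_dir()` on an affected system would show whether the value is readable there. Whether `next-server` is the right field to trust for metadata is a separate question.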
While the CloudStack datasource tries several approaches to find the IP, the code does not seem to handle the case where NetworkManager is in use.
What happens instead:
- the first approach is to try the data-server DNS entry; this depends on our setup, and we will try it out as well
- then, it looks for the DHCP lease file location "/run/systemd/netif/leases". For some reason, this value is a hardcoded variable in net/dhcp.py: NETWORKD_LEASES_DIR = '/run/systemd/netif/leases'
- then, it finds the lease file /var/lib/NetworkManager/internal-ea2b5464-7c5e-3243-aa40-7d77805f41ee-ens3.lease, but (as opposed to what we see on Ubuntu) it contains just one line, "ADDRESS=192.xxx.xxx.34". I am not sure why this file does not also contain the expected entry "SERVER_ADDRESS=192.xxx.xxx.5".
- finally, it falls back to the default gateway method.
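The fallback order described above can be sketched roughly as follows. This is not cloud-init's actual code: `first_hit`, `server_address_from_lease`, and `resolve_data_server` are hypothetical names, and the lease format is assumed to be simple `KEY=value` lines like the one quoted above.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Lookup = Tuple[str, Callable[[], Optional[str]]]


def resolve_data_server(hostname: str = "data-server") -> Optional[str]:
    """Step 1: try the 'data-server' DNS entry."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def server_address_from_lease(text: str) -> Optional[str]:
    """Steps 2 and 3: look for SERVER_ADDRESS=<ip> in lease file content."""
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "SERVER_ADDRESS" and value:
            return value
    return None


def first_hit(lookups: Iterable[Lookup]) -> Tuple[Optional[str], Optional[str]]:
    """Run lookups in order; return (name, address) of the first that succeeds."""
    for name, lookup in lookups:
        try:
            addr = lookup()
        except OSError:
            addr = None
        if addr:
            return name, addr
    return None, None
```

With a lease that only holds an `ADDRESS=` line, as in the file quoted above, `server_address_from_lease` returns `None`, so a chain built with `first_hit` ends at whatever the last entry (the default gateway lookup) returns. That matches the behaviour we observe.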
Would you say this is a bug, or rather a missing feature needed for interoperability with NetworkManager (in the sense that cloud-init does not look under /run/NetworkManager/)?
Launchpad user Peter M.(jdoe666) wrote on 2021-02-10T00:20:36.209793+00:00
P.S. I also asked at https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/658
Launchpad user James Falcon(falcojr) wrote on 2021-02-10T20:18:10.540098+00:00
Based on the process you've laid out, as well as the documentation (http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/4.8/virtual_machines/user-data.html), it looks like the metadata service should be at the same IP as a DHCP server, which explains the steps taken. All the steps taken are various ways to determine your DHCP server, while falling back to your current gateway.
I'm not sure what is unique about your setup that keeps these steps from working. However, checking "resolv.conf" isn't a valid solution. While it's true that a DHCP and DNS server may often reside at the same IP, that isn't guaranteed to be the case, and in most cases checking DNS is "more wrong" than inspecting DHCP leases.
Is the data-server DNS entry not working for you?
Launchpad user Dave(livegrenier) wrote on 2021-09-03T19:58:38.161908+00:00
Hello,
I am seeing the same problem under CloudStack 4.15 + Xen when using a shared network. Because I am using a shared network, the DHCP server is not the same as the gateway. Therefore cloud-init ends up failing, and the logs show that it is trying to use the gateway to fetch the metadata.
I see the same behaviour on CentOS 8 and Rocky Linux.
I have also tried changing the NETWORKD_LEASES_DIR setting, but did not have any luck. I am happy to provide more information or try any workarounds if someone can help.
Thanks.
Regards.
- Dave
Launchpad user Dave(livegrenier) wrote on 2021-10-13T07:56:45.746799+00:00
Hi,
Please let me know if I can provide any more info to help troubleshoot this problem.
Thanks.