cloud-init boots slowly on an EC2 instance with multiple NICs where the primary NIC is not first in the list

Open raineszm opened this issue 4 weeks ago • 0 comments

Bug report

This is related to #6232, which was fixed in #6233. Currently, the EC2 datasource tries each NIC in order and waits for it to time out completely before trying the next one.
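To make the cost of this behavior concrete, here is a minimal sketch of a simplified model, not cloud-init's actual code. It assumes each unreachable NIC consumes the full 240-second budget that appears as `[0/240s]` in the warnings later in this report. The function name `estimated_delay` and its parameters are hypothetical and exist only for illustration.

```python
# Simplified model of the behavior described above; not cloud-init's actual code.
# `estimated_delay` and its parameters are illustrative assumptions.


def estimated_delay(nics_in_trial_order, reachable_nic, max_wait=240.0):
    """Estimate seconds spent waiting on NICs tried before the reachable one.

    Assumes every unreachable NIC consumes the full `max_wait` budget before
    the next NIC is tried.
    """
    if reachable_nic not in nics_in_trial_order:
        raise ValueError(f"{reachable_nic!r} is not in the NIC list")
    nics_tried_first = nics_in_trial_order.index(reachable_nic)
    return nics_tried_first * max_wait


if __name__ == "__main__":
    order = ["ens1f0np0", "ens1f1np1", "ens41"]
    print(estimated_delay(order, "ens41"))      # primary NIC tried last
    print(estimated_delay(order, "ens1f0np0"))  # primary NIC tried first
```

With the primary NIC third in the list, this model gives 480 seconds, which is in the same range as the roughly 8 minutes between the first and third interface bring-up messages in the logs below.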

Steps to reproduce the problem

Provision an EC2 instance with multiple NICs such that the primary NIC is not first in the list.

Environment details

  • Cloud-init version: 25.2-0ubuntu1~24.04.1
  • Operating System Distribution: Ubuntu 24.04
  • Cloud provider, platform or installer type: AWS

cloud-init logs

The cloud-init-output.log contains a large number of errors of the form:

2025-12-04 20:30:54,763 - url_helper.py[WARNING]: Calling 'http://[fd00:ec2::254]/latest/api/token' failed [0/240s]: request error [HTTPConnectionPool(host='fd00:ec2::254', port=80): Max retries exceeded with url: /latest/api/token (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7060c212ac90>: Failed to establish a new connection: [Errno 101] Network is unreachable'))]

before the instance comes up after roughly 9 minutes (537 seconds of uptime in the line below):

Cloud-init v. 25.2-0ubuntu1~24.04.1 running 'init' at Thu, 04 Dec 2025 20:39:12 +0000. Up 537.20 seconds.
ci-info: ++++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++++
ci-info: +-----------+-------+-----------------------------+---------------+--------+-------------------+
ci-info: |   Device  |   Up  |           Address           |      Mask     | Scope  |     Hw-Address    |
ci-info: +-----------+-------+-----------------------------+---------------+--------+-------------------+
ci-info: | ens1f0np0 | False |              .              |       .       |   .    | a0:88:c2:95:b8:d4 |
ci-info: | ens1f1np1 | False |              .              |       .       |   .    | a0:88:c2:95:b8:d5 |
ci-info: |   ens41   |  True |        192.168.44.66        | 255.255.240.0 | global | 0a:77:51:47:bc:37 |
ci-info: |   ens41   |  True | fe80::877:51ff:fe47:bc37/64 |       .       |  link  | 0a:77:51:47:bc:37 |
ci-info: |     lo    |  True |          127.0.0.1          |   255.0.0.0   |  host  |         .         |
ci-info: |     lo    |  True |           ::1/128           |       .       |  host  |         .         |
ci-info: +-----------+-------+-----------------------------+---------------+--------+-------------------+

Looking in cloud-init.log, we see cloud-init bring up the first interface and start attempting to connect:

2025-12-04 20:30:54,607 - ephemeral.py[DEBUG]: Successfully brought up ens1f0np0 for ephemeral ipv6 networking.
2025-12-04 20:30:54,610 - DataSourceEc2.py[DEBUG]: Removed the following from metadata urls: ['http://instance-data.:8773']
2025-12-04 20:30:54,610 - DataSourceEc2.py[DEBUG]: Fetching Ec2 IMDSv2 API Token
2025-12-04 20:30:54,611 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/latest/api/token' with {'url': 'http://169.254.169.254/latest/api/token', 'stream': False, 'allow_redirects': True, 'method': 'PUT', 'timeout': 50.0, 'headers': {'X-aws-ec2-metadata-token-ttl-seconds': 'REDACTED', 'User-Agent': 'Cloud-Init/25.2-0ubuntu1~24.04.1'}} configuration
2025-12-04 20:30:54,762 - url_helper.py[DEBUG]: [0/1] open 'http://[fd00:ec2::254]/latest/api/token' with {'url': 'http://[fd00:ec2::254]/latest/api/token', 'stream': False, 'allow_redirects': True, 'method': 'PUT', 'timeout': 50.0, 'headers': {'X-aws-ec2-metadata-token-ttl-seconds': 'REDACTED', 'User-Agent': 'Cloud-Init/25.2-0ubuntu1~24.04.1'}} configuration
2025-12-04 20:30:54,762 - url_helper.py[DEBUG]: Exception(s) [UrlError("HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /latest/api/token (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7060c21289e0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), UrlError("HTTPConnectionPool(host='fd00:ec2::254', port=80): Max retries exceeded with url: /latest/api/token (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7060c212ac90>: Failed to establish a new connection: [Errno 101] Network is unreachable'))")] during request to http://[fd00:ec2::254]/latest/api/token, raising last exception
2025-12-04 20:30:54,763 - url_helper.py[WARNING]: Calling 'http://[fd00:ec2::254]/latest/api/token' failed [0/240s]: request error [HTTPConnectionPool(host='fd00:ec2::254', port=80): Max retries exceeded with url: /latest/api/token (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7060c212ac90>: Failed to establish a new connection: [Errno 101] Network is unreachable'))]
2025-12-04 20:30:54,763 - url_helper.py[DEBUG]: Please wait 1 seconds while we wait to try again

This repeats, with the wait time increasing every few tries from 1 second to 9 seconds. More than four minutes later, cloud-init tries again on the second interface:

2025-12-04 20:35:17,138 - ephemeral.py[DEBUG]: Successfully brought up ens1f1np1 for ephemeral ipv6 networking.
2025-12-04 20:35:17,138 - DataSourceEc2.py[DEBUG]: Removed the following from metadata urls: ['http://instance-data.:8773']
2025-12-04 20:35:17,138 - DataSourceEc2.py[DEBUG]: Fetching Ec2 IMDSv2 API Token
2025-12-04 20:35:17,139 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/latest/api/token' with {'url': 'http://169.254.169.254/latest/api/token', 'stream': False, 'allow_redirects': True, 'method': 'PUT', 'timeout': 50.0, 'headers': {'X-aws-ec2-metadata-token-ttl-seconds': 'REDACTED', 'User-Agent': 'Cloud-Init/25.2-0ubuntu1~24.04.1'}} configuration
2025-12-04 20:35:17,290 - url_helper.py[DEBUG]: [0/1] open 'http://[fd00:ec2::254]/latest/api/token' with {'url': 'http://[fd00:ec2::254]/latest/api/token', 'stream': False, 'allow_redirects': True, 'method': 'PUT', 'timeout': 50.0, 'headers': {'X-aws-ec2-metadata-token-ttl-seconds': 'REDACTED', 'User-Agent': 'Cloud-Init/25.2-0ubuntu1~24.04.1'}} configuration
2025-12-04 20:35:17,291 - url_helper.py[DEBUG]: Exception(s) [UrlError("HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /latest/api/token (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7060c218c1d0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), UrlError("HTTPConnectionPool(host='fd00:ec2::254', port=80): Max retries exceeded with url: /latest/api/token (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7060c218edb0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))")] during request to http://[fd00:ec2::254]/latest/api/token, raising last exception
2025-12-04 20:35:17,291 - url_helper.py[WARNING]: Calling 'http://[fd00:ec2::254]/latest/api/token' failed [0/240s]: request error [HTTPConnectionPool(host='fd00:ec2::254', port=80): Max retries exceeded with url: /latest/api/token (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7060c218edb0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))]
2025-12-04 20:35:17,291 - url_helper.py[DEBUG]: Please wait 1 seconds while we wait to try again

Finally, cloud-init tries the third interface, which succeeds:

2025-12-04 20:39:10,095 - ephemeral.py[DEBUG]: Successfully brought up ens41 for ephemeral ipv4 networking.
2025-12-04 20:39:10,095 - util.py[DEBUG]: Reading from /sys/class/net/ens41/operstate (quiet=False)
2025-12-04 20:39:10,095 - util.py[DEBUG]: Reading 3 bytes from /sys/class/net/ens41/operstate
2025-12-04 20:39:10,095 - ephemeral.py[DEBUG]: Successfully brought up ens41 for ephemeral ipv6 networking.
2025-12-04 20:39:10,095 - DataSourceEc2.py[DEBUG]: Removed the following from metadata urls: ['http://instance-data.:8773']
2025-12-04 20:39:10,095 - DataSourceEc2.py[DEBUG]: Fetching Ec2 IMDSv2 API Token
2025-12-04 20:39:10,096 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/latest/api/token' with {'url': 'http://169.254.169.254/latest/api/token', 'stream': False, 'allow_redirects': True, 'method': 'PUT', 'timeout': 50.0, 'headers': {'X-aws-ec2-metadata-token-ttl-seconds': 'REDACTED', 'User-Agent': 'Cloud-Init/25.2-0ubuntu1~24.04.1'}} configuration
2025-12-04 20:39:10,103 - url_helper.py[DEBUG]: Read from http://169.254.169.254/latest/api/token (200, 56b) after 1 attempts
2025-12-04 20:39:10,103 - DataSourceEc2.py[DEBUG]: Using metadata source: 'http://169.254.169.254'

After that, cloud-init happily continues on its way.
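The time lost on each interface can be read from the timestamps of the "Successfully brought up" messages. The following is a minimal sketch that does so using only the log lines quoted in this report; the helper names `bring_up_times` and `time_per_nic` are hypothetical, and the regular expression assumes the log format shown above.

```python
import re
from datetime import datetime

# Log lines copied from the excerpts in this report (first bring-up per NIC).
LOG_LINES = [
    "2025-12-04 20:30:54,607 - ephemeral.py[DEBUG]: Successfully brought up ens1f0np0 for ephemeral ipv6 networking.",
    "2025-12-04 20:35:17,138 - ephemeral.py[DEBUG]: Successfully brought up ens1f1np1 for ephemeral ipv6 networking.",
    "2025-12-04 20:39:10,095 - ephemeral.py[DEBUG]: Successfully brought up ens41 for ephemeral ipv4 networking.",
]

# Assumes the timestamp and message layout seen in the excerpts above.
PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - ephemeral\.py\[DEBUG\]: "
    r"Successfully brought up (?P<nic>\S+) for ephemeral"
)


def bring_up_times(lines):
    """Return [(nic, datetime)] for the first bring-up message of each NIC."""
    seen = {}
    for line in lines:
        match = PATTERN.match(line)
        if match and match.group("nic") not in seen:
            seen[match.group("nic")] = datetime.strptime(
                match.group("ts"), "%Y-%m-%d %H:%M:%S,%f"
            )
    return list(seen.items())


def time_per_nic(lines):
    """Return [(nic, seconds)] between a NIC's bring-up and the next NIC's."""
    events = bring_up_times(lines)
    return [
        (nic, (next_ts - ts).total_seconds())
        for (nic, ts), (_, next_ts) in zip(events, events[1:])
    ]


if __name__ == "__main__":
    for nic, seconds in time_per_nic(LOG_LINES):
        print(f"{nic}: {seconds:.3f} s before the next interface was tried")
```

For the lines above this reports about 262.5 seconds on ens1f0np0 and about 233.0 seconds on ens1f1np1, both close to the 240-second budget shown in the warnings.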

raineszm, Dec 11 '25 17:12