image-factory
Image factory IPv6 endpoint not working
I have an IPv6-enabled network and I can't download factory images.
❯ ping -c3 google.com
PING google.com (216.58.208.110): 56 data bytes
64 bytes from 216.58.208.110: icmp_seq=0 ttl=60 time=6.823 ms
64 bytes from 216.58.208.110: icmp_seq=1 ttl=60 time=6.328 ms
64 bytes from 216.58.208.110: icmp_seq=2 ttl=60 time=6.107 ms
--- google.com ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 6.107/6.419/6.823/0.299 ms
# ==============================================================
❯ ping6 -c3 google.com
PING6(56=40+8+8 bytes) [REDACTED] --> 2a00:1450:400e:811::200e
16 bytes from 2a00:1450:400e:811::200e, icmp_seq=0 hlim=119 time=2.898 ms
16 bytes from 2a00:1450:400e:811::200e, icmp_seq=1 hlim=119 time=1.812 ms
16 bytes from 2a00:1450:400e:811::200e, icmp_seq=2 hlim=119 time=1.852 ms
--- google.com ping6 statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 1.812/2.187/2.898/0.503 ms
# ==============================================================
❯ ping -c3 factory.talos.dev
PING factory.talos.dev (4.157.170.221): 56 data bytes
64 bytes from 4.157.170.221: icmp_seq=0 ttl=112 time=86.853 ms
64 bytes from 4.157.170.221: icmp_seq=1 ttl=112 time=85.983 ms
64 bytes from 4.157.170.221: icmp_seq=2 ttl=112 time=85.838 ms
--- factory.talos.dev ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 85.838/86.225/86.853/0.448 ms
# ==============================================================
❯ ping6 -c3 factory.talos.dev
PING6(56=40+8+8 bytes) [REDACTED] --> 2603:1030:20e:3::4a2
--- factory.talos.dev ping6 statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
ICMP might be blocked, but curl isn't working either.
❯ curl -v6 -s --connect-timeout 10 https://google.com > /dev/null
* Trying [2a00:1450:400e:811::200e]:443...
* Connected to google.com (2a00:1450:400e:811::200e) port 443
* ALPN: curl offers h2,http/1.1
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* CAfile: /Users/sartorijuniorw/allCAbundle.pem
* CApath: none
{ [5 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
{ [15 bytes data]
* TLSv1.3 (IN), TLS handshake, Certificate (11):
{ [6485 bytes data]
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
{ [80 bytes data]
* TLSv1.3 (IN), TLS handshake, Finished (20):
{ [52 bytes data]
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.3 (OUT), TLS handshake, Finished (20):
} [52 bytes data]
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN: server accepted h2
* Server certificate:
* subject: CN=*.google.com
* start date: Oct 23 11:18:24 2023 GMT
* expire date: Jan 15 11:18:23 2024 GMT
* subjectAltName: host "google.com" matched cert's "google.com"
* issuer: C=US; O=Google Trust Services LLC; CN=GTS CA 1C3
* SSL certificate verify ok.
} [5 bytes data]
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://google.com/
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: google.com]
* [HTTP/2] [1] [:path: /]
* [HTTP/2] [1] [user-agent: curl/8.4.0]
* [HTTP/2] [1] [accept: */*]
} [5 bytes data]
> GET / HTTP/2
> Host: google.com
> User-Agent: curl/8.4.0
> Accept: */*
>
{ [5 bytes data]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
{ [282 bytes data]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
{ [282 bytes data]
* old SSL session ID is stale, removing
{ [5 bytes data]
< HTTP/2 301
< location: https://www.google.com/
< content-type: text/html; charset=UTF-8
< content-security-policy-report-only: object-src 'none';base-uri 'self';script-src 'nonce-VJ3oVJ_nx2_nUkHfo3Zsrg' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/other-hp
< date: Fri, 17 Nov 2023 14:15:29 GMT
< expires: Fri, 17 Nov 2023 14:15:29 GMT
< cache-control: private, max-age=2592000
< server: gws
< content-length: 220
< x-xss-protection: 0
< x-frame-options: SAMEORIGIN
< set-cookie: CONSENT=PENDING+712; expires=Sun, 16-Nov-2025 14:15:29 GMT; path=/; domain=.google.com; Secure
< p3p: CP="This is not a P3P policy! See g.co/p3phelp for more info."
< alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
<
{ [220 bytes data]
* Connection #0 to host google.com left intact
# ==============================================================
❯ curl -v6 -s --connect-timeout 10 https://factory.talos.dev > /dev/null
* Trying [2603:1030:20e:3::4a2]:443...
* ipv6 connect timeout after 9999ms, move on!
* Failed to connect to factory.talos.dev port 443 after 10005 ms: Timeout was reached
* Closing connection
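As a quick cross-check, forcing curl onto a single address family separates the two paths (a minimal sketch; -I only fetches the response headers and the timeout value is arbitrary):
# IPv4 only: should connect, since the A record (4.157.170.221) answers.
❯ curl -4 -sI --connect-timeout 10 https://factory.talos.dev | head -n1
# IPv6 only: reproduces the timeout against 2603:1030:20e:3::4a2.
❯ curl -6 -sI --connect-timeout 10 https://factory.talos.dev | head -n1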
Can confirm with certainty! We've seen the same behavior when one of our testers had a dual-stack interface and it chose to try to grab the image over IPv6. We had to resort to kernel-disabling IPv6 by default, which, quite frankly, is not a good solution to the problem at all.
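For reference, "kernel-disabling" here means something along these lines (a sketch of the usual knobs, not a recommendation, since it takes the whole node off IPv6):
# At boot: disable IPv6 in the kernel entirely, e.g. by adding this to the
# kernel command line (in Talos, for instance via machine.install.extraKernelArgs):
ipv6.disable=1
# Or at runtime on a generic Linux host, the sysctl equivalent:
❯ sysctl -w net.ipv6.conf.all.disable_ipv6=1
❯ sysctl -w net.ipv6.conf.default.disable_ipv6=1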
We are in the process of fixing it.
Awesome and thank you! :)
@smira I ran into the same problem while preparing some hetzner root servers. could not download directly from https://factory.talos.dev
This has nothing to do with IPv6, as the factory doesn't have IPv6 records. I bet Hetzner owns some IPs that are identified as coming from Iran (or something like that) and are geo-blocked by the cloud. There is no fix we can apply on our side.
Hmm, good to know. I'll provide my own install URLs from another server.
Is this something you added recently? I could directly download images from that same IP out of Germany a couple of months ago; now I get a timeout.
We don't do geo-blocking ourselves, but the cloud where Image Factory runs might do that outside of our control.
Once again, just a guess, and completely unrelated to this issue.
Mind sharing the cloud provider btw?
I don't think this is related, and it might change in the future. Let's stick to the issue topic.
Image Factory still doesn't support IPv6, and that remains the case. Everything else, if it is actually a problem, deserves its own issue.
I would like to add that a lot of Kubernetes distributions have rather... quirky IPv6 defaults anyway, so for most users it might be advisable to disable IPv6 entirely in their cluster.
I tried overriding the default NTP and DNS servers using the ip=:::::::<dns0-ip>:<dns1-ip>:<ntp0-ip> kernel argument, pointing DNS at a DNS64 server combined with a NAT64 gateway instead. This sort of works, but it's not convenient at all.
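To make that concrete, a filled-in form of that argument could look like this (a sketch: the 2001:db8::/32 addresses are placeholders for your own DNS64 resolvers and an IPv6-reachable NTP server, and bracketing IPv6 addresses inside ip= is an assumption worth verifying for your setup):
# Field order: ip=<client>:<server>:<gw>:<netmask>:<hostname>:<iface>:<autoconf>:<dns0>:<dns1>:<ntp0>
# The first seven fields are left empty so only DNS and NTP are overridden.
ip=:::::::[2001:db8::53]:[2001:db8::54]:[2001:db8::123]
# The DNS64 resolver synthesizes an AAAA record for IPv4-only names by embedding
# the A record in the well-known NAT64 prefix 64:ff9b::/96 (4.157.170.221 becomes
# 64:ff9b::49d:aadd), and the NAT64 gateway translates that traffic back to IPv4.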
@smira suggested I use a pull-through IPv6 mirror of factory.talos.dev, and that actually works great 😄
For anyone looking for a quick solution: you can use https://factory.lion7.dev as an IPv6 alternative (it's an IPv6 pull-through cache of https://factory.talos.dev). Of course, no guarantees on availability 🤞🏻
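If you go that route, the change is just the hostname (a sketch, assuming the mirror exposes the same paths as the upstream factory; the schematic ID and Talos version are placeholders):
# Downloading a boot image through the mirror instead of factory.talos.dev:
❯ curl -LO https://factory.lion7.dev/image/<schematic-id>/<talos-version>/metal-amd64.iso
# The installer reference in the machine config follows the same pattern:
#   machine.install.image: factory.lion7.dev/installer/<schematic-id>:<talos-version>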
Holy smokes, I banged my head against this for two days trying to add a node to the cluster, until I found this thread. factory.lion7.dev pull-through works. That was super confusing and frustrating.