IPv6 routing not working for opam.ocaml.org
Hello! I am unable to run opam init on an ipv6-only host due to a routing issue to opam.ocaml.org. The domain resolves to 2001:bc8:5080:8e02::1 and 2001:bc8:1d80:4600::1. When I try to run tracepath it seems to go into a loop:
$ tracepath 2001:bc8:1d80:4600::1
[...SNIP...]
9: 2001:1900:5:2:2:0:11c:6b2 51.250ms asymm 8
10: 2001:bc8:1d00:1::1 51.624ms asymm 11
11: 2001:bc8:1d10:4::1 52.090ms
12: 2001:bc8:1d10:4::6 49.801ms asymm 11
13: 2001:bc8:1d10:4::1 52.756ms asymm 12
14: 2001:bc8:1d10:4::6 51.107ms asymm 11
15: 2001:bc8:1d10:4::1 52.004ms asymm 11
16: 2001:bc8:1d10:4::6 49.985ms asymm 11
17: 2001:bc8:1d10:4::1 52.133ms asymm 12
18: 2001:bc8:1d10:4::6 51.931ms asymm 11
19: 2001:bc8:1d10:4::1 51.175ms asymm 11
20: 2001:bc8:1d10:4::6 50.132ms asymm 11
21: 2001:bc8:1d10:4::1 51.262ms asymm 12
22: 2001:bc8:1d10:4::6 50.977ms asymm 11
23: 2001:bc8:1d10:4::1 51.414ms asymm 11
24: 2001:bc8:1d10:4::6 51.919ms asymm 11
25: 2001:bc8:1d10:4::1 51.293ms asymm 12
26: 2001:bc8:1d10:4::6 51.969ms asymm 11
27: 2001:bc8:1d10:4::1 52.046ms asymm 11
28: 2001:bc8:1d10:4::6 50.070ms asymm 11
29: 2001:bc8:1d10:4::1 53.301ms asymm 12
30: 2001:bc8:1d10:4::6 51.972ms asymm 11
Too many hops: pmtu 1500
Resume: pmtu 1500
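From hop 11 onwards the trace alternates between 2001:bc8:1d10:4::1 and 2001:bc8:1d10:4::6 until the hop limit is reached, which is the signature of a routing loop. Below is a minimal Python sketch for picking such a loop out of tracepath output. `parse_hops` and `find_routing_loop` are hypothetical helper names, and the parser assumes one address per line in the format shown above.

```python
import ipaddress


def parse_hops(text):
    """Collect the address from every line shaped like '<hop>: <address> ...'."""
    hops = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the prompt, "[...SNIP...]", "Too many hops", "Resume" and similar lines.
        if len(parts) < 2 or not parts[0].rstrip(":").isdigit():
            continue
        try:
            ipaddress.ip_address(parts[1])
        except ValueError:
            continue  # e.g. a "no reply" line
        hops.append(parts[1])
    return hops


def find_routing_loop(hops):
    """Return the addresses between the first repeated hop and its earlier
    occurrence, or None if no address appears twice."""
    last_seen = {}
    for index, address in enumerate(hops):
        if address in last_seen:
            return hops[last_seen[address]:index]
        last_seen[address] = index
    return None
```

Fed the output above, `find_routing_loop(parse_hops(output))` should return the two routers that keep handing the packet to each other.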
@avsm The IPv6 addresses of these two machines appear to have changed. The current values are 2001:bc8:5080:a405::1 (opam-4) and 2001:bc8:1d80:4a00::1 (opam-5). Please can you update the DNS?
Now updated, and worryingly that went quite a long time without being noticed. It might be worth having a specific healthcheck somewhere for an IPv6-specific connection @mtelvers. It is quite hard to spot this manually unless you are on an IPv6-only network (the only one of those I have is Mythic Beasts rPi hosting).
@avsm The IPv6 addresses appear to have changed again. opam-4 is now 2001:bc8:5080:a405::1 and opam-5 is now 2001:bc8:1d80:4a00::1.
Uhm, I don't quite understand your setup... But can't you statically configure IPv6 addresses, and advertise them in DNS?
@hannesm there's something wrong with the new Scaleway setup -- those advertised addresses shouldn't be changing.
@mtelvers the AAAA records already matched the ones you posted. They don't appear to have changed again -- what records are you seeing?
@avsm I used https://www.ssllabs.com/ssltest/analyze.html?d=opam.ocaml.org.
This was suggested by @hannesm on https://github.com/ocaml/opam/issues/5550#issuecomment-1547326886
@avsm I can also see the wrong address via
$ nslookup opam.ocaml.org 8.8.8.8
Server: 8.8.8.8
Address: 8.8.8.8#53
Non-authoritative answer:
Name: opam.ocaml.org
Address: 151.115.76.159
Name: opam.ocaml.org
Address: 51.158.232.133
Name: opam.ocaml.org
Address: 2001:bc8:1d80:4600::1
Name: opam.ocaml.org
Address: 2001:bc8:5080:8e02::1
It took me a while to figure out where I could report my issue. I first asked on #ocaml on libera.chat, and eventually I found this repository (that I have encountered before through #27).
I usually have ipv4 (often only ipv4), but I had spun up a virtual machine and chose to save a few cents by going ipv6 only. This is how I found out. I also tried to work around the issue by using the git repository, and by running an opam-mirr. However, GitHub does not support ipv6 (another possible explanation for why no one on an ipv6-only connection has reported this problem), so the opam-repository git repository was not accessible, and neither was a large portion of the source code and archives, including the OCaml compiler itself. All in all opam seems a little brittle on ipv6. This probably deserves a separate issue.
As for testing, I imagine that setting OPAMFETCH to a curl, wget, fetch, ... invocation that forces ipv6 should do on a dual-stack host, so an ipv6-only host would not be required to perform the test.
@reynir Yes, I found that I can test all the options using these four commands (on a dual-stack machine):
curl --resolve opam.ocaml.org:443:[2001:bc8:1d80:4a00::1] -o /dev/null https://opam.ocaml.org
curl --resolve opam.ocaml.org:443:[2001:bc8:5080:a405::1] -o /dev/null https://opam.ocaml.org
curl --resolve opam.ocaml.org:443:151.115.76.159 -o /dev/null https://opam.ocaml.org
curl --resolve opam.ocaml.org:443:51.158.232.133 -o /dev/null https://opam.ocaml.org
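The four commands above can also be generated from a list of addresses, which makes the check easier to re-run when an address changes. This is a minimal Python sketch, assuming `curl` is on the PATH. The helper names `resolve_arg`, `curl_command` and `check_all` are made up for illustration, the addresses are the ones quoted in this thread, and `--silent --show-error --fail` are additions so that an HTTP error shows up as a non-zero exit status.

```python
import ipaddress
import subprocess

HOST = "opam.ocaml.org"
PORT = 443
# Addresses quoted in this thread; update them if the machines move again.
ADDRESSES = [
    "2001:bc8:1d80:4a00::1",
    "2001:bc8:5080:a405::1",
    "151.115.76.159",
    "51.158.232.133",
]


def resolve_arg(host, port, address):
    """Build the value for curl's --resolve option (IPv6 literals need brackets)."""
    ip = ipaddress.ip_address(address)
    literal = f"[{ip}]" if ip.version == 6 else str(ip)
    return f"{host}:{port}:{literal}"


def curl_command(host, port, address):
    """Return the argv list for one request pinned to a single address."""
    return [
        "curl", "--silent", "--show-error", "--fail",
        "--resolve", resolve_arg(host, port, address),
        "--output", "/dev/null",
        f"https://{host}:{port}/",
    ]


def check_all(addresses=ADDRESSES):
    """Run curl once per address; map each address to True if curl exited with 0."""
    results = {}
    for address in addresses:
        completed = subprocess.run(curl_command(HOST, PORT, address))
        results[address] = completed.returncode == 0
    return results
```

Calling `check_all()` on a dual-stack machine exercises both IPv4 and both IPv6 endpoints without depending on what DNS currently publishes.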
At the moment, only 3 out of 4 of these addresses match the published DNS entries. The last one in this output is wrong.
$ nslookup opam.ocaml.org ns1.gandi.net
Server: ns1.gandi.net
Address: 173.246.100.2#53
Name: opam.ocaml.org
Address: 151.115.76.159
Name: opam.ocaml.org
Address: 51.158.232.133
Name: opam.ocaml.org
Address: 2001:bc8:1d80:4a00::1
Name: opam.ocaml.org
Address: 2001:bc8:5080:8e02::1
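The mismatch can be made explicit by comparing the published records against the addresses the machines actually use. This is a small Python sketch: `compare_records` is a hypothetical helper, the published list is copied from the ns1.gandi.net answer above, and the expected list uses the addresses reported earlier in this thread.

```python
import ipaddress


def compare_records(published, expected):
    """Return (stale, missing): addresses published but not expected,
    and addresses expected but not published."""
    published_set = {ipaddress.ip_address(a) for a in published}
    expected_set = {ipaddress.ip_address(a) for a in expected}
    stale = sorted(str(a) for a in published_set - expected_set)
    missing = sorted(str(a) for a in expected_set - published_set)
    return stale, missing


# Answer from ns1.gandi.net, copied from the nslookup output above.
published = [
    "151.115.76.159",
    "51.158.232.133",
    "2001:bc8:1d80:4a00::1",
    "2001:bc8:5080:8e02::1",
]
# Addresses the machines actually use, as reported earlier in the thread.
expected = [
    "151.115.76.159",
    "51.158.232.133",
    "2001:bc8:1d80:4a00::1",
    "2001:bc8:5080:a405::1",
]

stale, missing = compare_records(published, expected)
print("stale:", stale)      # stale: ['2001:bc8:5080:8e02::1']
print("missing:", missing)  # missing: ['2001:bc8:5080:a405::1']
```

Parsing through `ipaddress` means that differently written forms of the same address (upper case, uncompressed zeros) still compare equal.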
Right, I looked in my JavaScript console on Gandi, and found internal 500 errors reported from the web UI. It looks like there was a glitch in the update UI for Gandi itself, so some of the changes just didn't go through. I've made the changes again, and they should all be propagating now.
I'm also surprised by the lack of GitHub IPv6-only support; see https://github.com/orgs/community/discussions/10539. It's something we can only partially fix via #29 since it doesn't solve the issue of how to create issues, even if we mirror our source code.
Mark, I'll give this one back to you to see if you'd like an IPv6-healthcheck. Otherwise it should be sorted now I think.
@avsm I have written an OCurrent pipeline that resolves the name, validates the certificates, tries to download from the website, and then posts the results to a Slack channel. http://observer.ocamllabs.io
cc. @tmcgilchrist
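For readers who want to try the same kind of check locally, here is a rough Python sketch of the three steps described above (resolve the name, validate the certificate, attempt a download). It is not the OCurrent pipeline, whose source is not shown in this thread. The function names are hypothetical, the Slack notification is left out, and only the standard library is used.

```python
import socket
import ssl


def parse_status_line(line):
    """Extract the status code from an HTTP status line such as b'HTTP/1.1 200 OK'."""
    parts = line.decode("ascii", errors="replace").split()
    if len(parts) < 2 or not parts[0].startswith("HTTP/") or not parts[1].isdigit():
        raise ValueError(f"not an HTTP status line: {line!r}")
    return int(parts[1])


def check_address(host, address, port=443, timeout=10.0):
    """Connect to one specific address, verify the certificate against `host`,
    send a HEAD request and return the HTTP status code."""
    context = ssl.create_default_context()  # verifies the chain and the hostname
    with socket.create_connection((address, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls.sendall(request.encode("ascii"))
            response = b""
            while b"\r\n" not in response:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                response += chunk
    return parse_status_line(response.split(b"\r\n", 1)[0])


def check_host(host, port=443):
    """Resolve every address for `host` and check each one in turn.
    Each address maps to a status code, or to the exception that was raised."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = sorted({info[4][0] for info in infos})
    results = {}
    for address in addresses:
        try:
            results[address] = check_address(host, address, port)
        except (OSError, ValueError) as exc:  # ssl.SSLError is a subclass of OSError
            results[address] = exc
    return results
```

`check_host("opam.ocaml.org")` covers whatever DNS currently returns, while `check_address` can be pointed at an address DNS does not publish yet. On a host without IPv6 connectivity the AAAA entries will simply show up as connection errors.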
@avsm The IPv6 address of staging.ocaml.org should be 2001:bc8:1204:a40b::1 rather than 2001:bc8:1202:920c::1. Please could you update the DNS entry?
I have written an OCurrent pipeline that resolves the name, validates the certificates, tries to download from the website, and then posts the results to a Slack channel. http://observer.ocamllabs.io/
Looks great! Don't forget to add it to ocurrent/overview when you upload the source code. This is distinct from the deployer, I presume?
We got an alert for an opam.ocaml.org ipv6 address
@mtelvers testing: Error: Command "ping" "-c" "4" "2001:bc8:5080:a405::1" exited with status 1
But https://www.ssllabs.com/ssltest/analyze.html?d=opam.ocaml.org shows that this address is fine, as do the logs in https://observer.ocamllabs.io/job/2024-08-09/153109-curl-5ae5d7
so maybe this was just a passing thing.