
A mirror for Guix?

Open lechner opened this issue 2 years ago • 9 comments

Hi,

Would the Micro Mirror project potentially be interested in mirroring Guix?

I follow your Mastodon posts about the HP T620 devices and other topics with great interest!

Guix currently serves about 4.3 TB of continuously built artifacts from servers in Berlin (at the Max Delbrück Center for Molecular Medicine) and Bordeaux. We are popular in the academic high-performance segment and are going mainstream.

I personally live about two miles from Hurricane Electric's FMT2 data center.

Thanks for considering!

Kind regards Felix Lechner

lechner avatar Dec 20 '22 22:12 lechner

We can certainly explore pulling it into mirror.fcix.net. The Micro Mirrors are reserved for the highest demand projects, so it's unlikely that Guix has the CDN efficiency numbers to justify it.

The one thing we do require is a standard rsync upstream and load balancer to direct users to our mirror. Does Guix have that set up? I'm not seeing any mention of a MirrorBits or MirrorBrain running.

PhirePhly avatar Dec 21 '22 00:12 PhirePhly

It looks like Guix is carried as part of the GNU project. The download links on https://guix.gnu.org/en/download/ seem to unfortunately point only at the tier0 GNU server and not the load balanced URL that redirects to the mirrors.

Are those the build artifacts you're talking about, or is there another folder of artifacts prior to what ends up under /gnu/?

PhirePhly avatar Dec 21 '22 00:12 PhirePhly

Hi,

Having followed your blog, I believe that your proficiency with networks exceeds that of our six hundred contributors combined.

Your secondary DNS service, for example, is the coolest thing I have seen in a while. I was so inspired that I packaged bgpq3 for Guix in hope of eventually winning you as a user for our "one-minute" rapid deployment.

In Guix, system configurations are usually a single file. I regularly move services from one device to another using just copy and paste, followed by guix deploy.

The build artifacts are served from https://ci.guix.gnu.org/ and https://bordeaux.guix.gnu.org/. Both are substitute servers, which means that users can draw on prepackaged build artifacts if their own equipment is too slow to build for themselves.

It usually is.

I spoke with the operators of those two servers and received permission to contact you. They know about your rsync requirement and your DNS multicast wizardry, but I'm a little bit helpless beyond that.

Kind regards Felix Lechner

lechner avatar Dec 21 '22 00:12 lechner

Got it, so the substitute servers are kind of like the macports mirrors. You're also too kind.

  1. Ignoring the build artifacts for a moment, I think all the download links on https://guix.gnu.org/en/download/ should be pointed at ftpmirror.gnu.org/... instead of ftp.gnu.org/..., because then mirror.fcix.net and all the other community mirrors are in play for your stable releases, which isn't currently the case.

  2. For the substitute servers, how do clients pick one or the other? Is there a single URL that the build clients use which gets 302ed to these two servers, and we're looking at adding a third one? Or does the client retrieve a list of servers to pull from and select one on its own? Essentially, if we pull this folder into mirror.fcix.net, how will clients automatically start using us?

  3. How big of a folder of build artifacts are we talking about here? The /gnu/guix/ folder is only 11GB, so would it be right to liken the two substitute servers to the package repo, where the /gnu/ folder is just the install media? Is the 4.3TB number the folder size or the daily traffic number?

PhirePhly avatar Dec 21 '22 01:12 PhirePhly

Hi,

I googled "Macports mirrors" and suppose they are similar. Substitute servers are different in that they continuously rebuild packages whose prerequisites have changed.

In Guix, all those package versions can peacefully co-exist (and regular users can install them at the same time without superuser privileges). When you install a Guix software package you also get the exact prerequisites with which the software was built.

Everything is fully reproducible. The design is different from many other Linux distributions, but the system eliminates entire classes of bugs and uncertainties. For example, there are no more missing shared library symbols.

Regarding your point 1, I will talk to the Guix web folks about linking to https://ftpmirror.gnu.org instead. Thank you for the suggestion!

The releases are used only for initial installations. Users will do a guix pull at their earliest opportunity to get the latest software.

As for your point 2, clients know about both substitute servers. I am not sure how they are prioritized at the moment. The servers rebuild packages independently. Having two different servers, plus the user's own equipment, allows users to check that their software is reproducible.

A single timestamp in the build products will render a build not reproducible.
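
To illustrate the point about timestamps, here is a minimal sketch of comparing independently built outputs by hash. It is not how Guix itself performs the check; the function names and the sample byte strings are made up for the example.

```python
import hashlib
from typing import Iterable


def digest(build_output: bytes) -> str:
    """Return the SHA-256 hex digest of one build output."""
    return hashlib.sha256(build_output).hexdigest()


def is_reproducible(outputs: Iterable[bytes]) -> bool:
    """True when every independently produced output hashes identically."""
    return len({digest(output) for output in outputs}) == 1


if __name__ == "__main__":
    # Hypothetical outputs from two servers and the user's own machine.
    stable = [b"hello-2.12 binary", b"hello-2.12 binary", b"hello-2.12 binary"]
    # Same payload, but each build embedded its own (made-up) build time.
    stamped = [b"hello-2.12 binary built=1671580800",
               b"hello-2.12 binary built=1671580923"]
    print(is_reproducible(stable))
    print(is_reproducible(stamped))
```

Any byte that differs between builds, such as an embedded build time, changes the digest, so the outputs no longer match.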

The best way to start using a new mirror might be your DNS multicast magic. Does it work only for small UDP queries or also for large file transfers?

Regarding your point 3, I know that bordeaux keeps 4.3 TB of substitutes on hand. Some of those are dated and made available merely because users are generally slow to issue guix pull. (Due to poor speeds currently, it tends to be a somewhat expensive operation.) As I hinted before, that command causes users to request the latest versions of all packages going forward.

In addition, the substitute servers keep older versions on hand to enable client-side rollbacks in case there is unexpected breakage. On occasion, a new version of something is defective. Users can then just reinstall the previous version. In fact, both versions can be installed at the same time.

The ingenious use of symbolic links and environment variables makes that magic possible. I spent a long time with a major Linux distribution. Guix is the most exciting operating system I have seen in many years.

Kind regards Felix Lechner

lechner avatar Dec 21 '22 02:12 lechner

Hi,

Per your kind instructions, all links on the download page (item 1) now point to https://ftpmirror.gnu.org.

Felix

lechner avatar Dec 21 '22 22:12 lechner

To follow up on this: looking at the two Guix build servers, if Guix wanted to scale serving the pre-built packages horizontally, it would make sense to set up a real mirroring infrastructure using something like MirrorBits or MirrorBrain, with /guix/ and /guix-vault/ modules. The latest builds could then be served from several mirrors, and users who wanted to roll back to an older build could find those in the vault module.
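
As a rough illustration of what such a redirector does, here is a simplified sketch that picks a mirror by client continent and builds the redirect target. It is not MirrorBits or MirrorBrain code; the mirror hostnames, continent codes, and function names are all hypothetical, and a real redirector would also need to know which mirrors are healthy and actually hold the requested file.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Mirror:
    base_url: str   # hypothetical mirror root, no trailing slash
    continent: str  # two-letter continent code, e.g. "EU" or "NA"


# Hypothetical mirror list; none of these hosts exist.
MIRRORS = (
    Mirror("https://mirror-eu.example.org/guix", "EU"),
    Mirror("https://mirror-na.example.org/guix", "NA"),
)


def pick_mirror(client_continent: str,
                mirrors: Sequence[Mirror] = MIRRORS) -> Mirror:
    """Return the first mirror on the client's continent, else the first mirror."""
    for mirror in mirrors:
        if mirror.continent == client_continent:
            return mirror
    return mirrors[0]


def redirect_location(client_continent: str, path: str) -> str:
    """Build the URL a redirector would send back in a 302 Location header."""
    mirror = pick_mirror(client_continent)
    return f"{mirror.base_url}/{path.lstrip('/')}"


if __name__ == "__main__":
    print(redirect_location("NA", "/nar/lzip/example-package"))
    print(redirect_location("AS", "/nar/lzip/example-package"))
```

Clients keep requesting a single URL, and the redirector decides which mirror answers, so new mirrors come into use without any client-side change.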

PhirePhly avatar Jan 27 '23 18:01 PhirePhly

Hi Ken,

You know everything about networking. Would you please be so kind as to offer advice on why GNU Guix has poor download speeds?

The main artifact server in Berlin, Germany serves files at great speeds in excess of 30 MB/s to some folks in Europe, but only at 500 kB/sec to many others (including myself). Many believe the issue is peering.

The fast connections are mostly located on European research networks. They probably tie directly to Germany's DFN. Is the poor connectivity in the U.S. related to the Hurricane Electric / Cogent problem, or is there another peering issue?

Here are some stats from people for a related set of backup servers.

I tie into Hurricane Electric via Sail Internet, a Fremont ISP, but my speeds are also poor. My static IP is 208.82.101.137.

Here is a large test file that may help with a diagnosis.

wget https://ci.guix.gnu.org/nar/lzip/c6qpr19zjlmvb729d01n7p7pfn65a9il-stellarium-23.2

Thank you so much!

Kind regards Felix

lechner avatar Aug 30 '23 18:08 lechner

I'm seeing the same 600kBps-1MBps sometimes, but then other times I'm seeing north of 100Mbps. Regardless of congested peering, that's not particularly shocking given how physically far away the box is. With a lot of work you might be able to find a specific smoking gun link, but getting anyone to fix it is unlikely.

[mirror@codingflyboy ~]$ wget https://ci.guix.gnu.org/nar/lzip/c6qpr19zjlmvb729d01n7p7pfn65a9il-stellarium-23.2 -O /dev/null
--2023-08-30 11:22:15--  https://ci.guix.gnu.org/nar/lzip/c6qpr19zjlmvb729d01n7p7pfn65a9il-stellarium-23.2
Resolving ci.guix.gnu.org (ci.guix.gnu.org)... 141.80.181.40
Connecting to ci.guix.gnu.org (ci.guix.gnu.org)|141.80.181.40|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 262435305 (250M) [application/octet-stream]
Saving to: ‘/dev/null’

/dev/null           100%[===================>] 250.28M  11.8MB/s    in 39s

2023-08-30 11:22:56 (6.38 MB/s) - ‘/dev/null’ saved [262435305/262435305]

[mirror@codingflyboy ~]$ ping ci.guix.gnu.org
PING ci.guix.gnu.org (141.80.181.40) 56(84) bytes of data.
64 bytes from 141.80.181.40 (141.80.181.40): icmp_seq=1 ttl=51 time=145 ms
64 bytes from 141.80.181.40 (141.80.181.40): icmp_seq=2 ttl=51 time=144 ms
64 bytes from 141.80.181.40 (141.80.181.40): icmp_seq=3 ttl=51 time=144 ms
64 bytes from 141.80.181.40 (141.80.181.40): icmp_seq=4 ttl=51 time=144 ms
^C
--- ci.guix.gnu.org ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 144.353/144.419/144.531/0.068 ms

So given that 144ms latency, I would suspect that the speed of light is more your issue than any specific congested peering, but it's also possible there's some congestion somewhere hurting it.
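
A back-of-the-envelope sketch of why latency alone can cap the speed: a single TCP connection can have at most one window of data in flight per round trip, so throughput is bounded by window size divided by RTT. The window sizes below are illustrative assumptions, not values measured on either end of this connection.

```python
def window_limited_throughput(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound in bytes/second for one TCP connection: window / RTT."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes / rtt_seconds


def window_needed(target_bytes_per_second: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to sustain the target rate (bandwidth-delay product)."""
    return target_bytes_per_second * rtt_seconds


if __name__ == "__main__":
    rtt = 0.144  # seconds, from the ping output above
    # Assumed effective window sizes, for illustration only.
    for window in (64 * 1024, 1024 * 1024, 4 * 1024 * 1024):
        rate = window_limited_throughput(window, rtt)
        print(f"{window // 1024:>5} KiB window -> {rate / 1e6:.2f} MB/s")
    needed = window_needed(30e6, rtt)
    print(f"30 MB/s at 144 ms needs about {needed / 1e6:.2f} MB in flight")
```

With a 64 KiB effective window, the bound at 144 ms works out to roughly 0.46 MB/s, which is in the neighborhood of the 500 kB/s reported earlier in this thread. Reaching 30 MB/s over the same path would need about 4.3 MB in flight, and any packet loss shrinks the window further.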

The real fix is not having users download from 140ms away if you want downloads to be faster than that. Time to set up a MirrorBits instance and route clients to servers on the same continent as them.

PhirePhly avatar Aug 30 '23 18:08 PhirePhly