Update bottling Linux distribution to Ubuntu 22.04

Open sjackman opened this issue 1 year ago • 47 comments

Provide a detailed description of the proposed feature

The current bottling distribution is Ubuntu 16.04. I propose updating the bottling distribution to a more recent version of Ubuntu. See https://ubuntu.com/about/release-cycle

| Distribution | Glibc | GCC    |
|--------------|-------|--------|
| CentOS 7     | 2.17  | 4.8.5  |
| Ubuntu 16.04 | 2.23  | 5.4.0  |
| Ubuntu 18.04 | 2.27  | 7.5.0  |
| Debian 10.12 | 2.28  | 8.3.0  |
| Ubuntu 20.04 | 2.31  | 9.4.0  |
| Debian 11.4  | 2.31  | 10.2.1 |
| Ubuntu 22.04 | 2.35  | 11.2.0 |

What is the motivation for the feature?

Linux bottles ought to be built using the GCC provided by the bottling infrastructure. Some ~300 bottles fail to build using the GCC 5.4.0 provided by Ubuntu 16.04, because they require C++17 or C++20.
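To make the compiler constraint concrete, here is a minimal sketch that filters the distributions listed in the table by host GCC version. It assumes a formula's requirement can be reduced to a minimum GCC major version; the `min_gcc_major` parameter and the function names are hypothetical and not part of Homebrew.

```python
# Distribution data copied from the table in this issue: (name, glibc, gcc).
DISTRIBUTIONS = [
    ("CentOS 7", "2.17", "4.8.5"),
    ("Ubuntu 16.04", "2.23", "5.4.0"),
    ("Ubuntu 18.04", "2.27", "7.5.0"),
    ("Debian 10.12", "2.28", "8.3.0"),
    ("Ubuntu 20.04", "2.31", "9.4.0"),
    ("Debian 11.4", "2.31", "10.2.1"),
    ("Ubuntu 22.04", "2.35", "11.2.0"),
]


def parse_version(text):
    """Turn a dotted version string such as "5.4.0" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def distributions_with_gcc_at_least(min_gcc_major):
    """Return the names of distributions whose host GCC is new enough.

    min_gcc_major is a hypothetical stand-in for "the oldest GCC major
    version that can build this formula"; Homebrew exposes no such field.
    """
    return [
        name
        for name, _glibc, gcc in DISTRIBUTIONS
        if parse_version(gcc)[0] >= min_gcc_major
    ]


if __name__ == "__main__":
    # Example: a formula assumed (for illustration) to need GCC 9 or newer.
    print(distributions_with_gcc_at_least(9))
```

Under that assumption, a formula needing GCC 9 or newer could be built with the host compiler only on Ubuntu 20.04, Debian 11.4, and Ubuntu 22.04.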

How will the feature be relevant to at least 90% of Homebrew users?

These ~300 bottles that currently depend on brewed GCC will no longer depend on brewed GCC.

What alternatives to the feature have been considered?

No good alternatives exist. It is widely accepted that we must upgrade the bottling infrastructure OS.

Tasks and open PRs

Please sort these tasks and PRs in the order that they ought to be completed and merged.

  • [x] https://github.com/Homebrew/brew/pull/13618
  • [x] https://github.com/Homebrew/homebrew-core/pull/100411
  • [x] https://github.com/Homebrew/homebrew-core/pull/106755
  • [x] https://github.com/Homebrew/glibc-bootstrap/pull/3
  • [x] https://github.com/Homebrew/homebrew-core/pull/106837
  • [x] https://github.com/Homebrew/brew/pull/13751
  • [x] https://github.com/Homebrew/brew/pull/13577
  • [x] https://github.com/Homebrew/brew/pull/13758
  • [x] https://github.com/Homebrew/homebrew-core/pull/109272
  • [x] https://github.com/Homebrew/brew/pull/13770
  • [x] https://github.com/Homebrew/homebrew-core/pull/108816
  • [x] https://github.com/Homebrew/brew/pull/13733
  • [x] Tag Homebrew 3.6.0
  • [x] https://github.com/Homebrew/homebrew-core/pull/108590
  • [x] https://github.com/Homebrew/brew/pull/13819
  • [ ] https://github.com/Homebrew/homebrew-core/issues/110010

sjackman avatar Jul 29 '22 16:07 sjackman

Ubuntu 18.04 is EOL next year, so I strongly advise against switching to that since I don't want us to be having this discussion again so soon.

Moving forward, I think it would be helpful to formalise and document the process of upgrading our bottling infrastructure, so it can be done more easily the next time we need to do it. Ideally, I'd like for it to be just changing some variables stored somewhere, but that might be asking for too much.

carlocab avatar Jul 29 '22 16:07 carlocab

Ubuntu 18.04 is EOL next year, so I strongly advise against switching to that since I don't want us to be having this discussion again so soon.

Note that LTS releases have security updates for ten years, and so Ubuntu 18.04 would continue to receive security updates until 2028.

For each Ubuntu LTS release, Canonical maintains the Base Packages and provides security updates, including kernel livepatching, for a period of ten years.

https://ubuntu.com/about/release-cycle
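The two dates being discussed follow from simple arithmetic on the LTS release year. A rough sketch, assuming five years of standard support and ten years of security updates in total per LTS release (the constant and function names are illustrative only):

```python
# Assumed support periods for an Ubuntu LTS release, in years.
# Ten years is the figure quoted above; five years of standard
# support is an assumption about the usual LTS policy.
STANDARD_SUPPORT_YEARS = 5
TOTAL_SECURITY_YEARS = 10


def lts_support_end_years(release_year):
    """Return (end of standard support, end of security updates) as years."""
    return (
        release_year + STANDARD_SUPPORT_YEARS,
        release_year + TOTAL_SECURITY_YEARS,
    )


if __name__ == "__main__":
    standard_end, security_end = lts_support_end_years(2018)
    print(f"Ubuntu 18.04: standard support ends {standard_end}, "
          f"security updates end {security_end}")
```

For Ubuntu 18.04 this gives 2023 for the end of standard support and 2028 for the end of security updates, which matches both comments above.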

sjackman avatar Jul 29 '22 17:07 sjackman

I'm not as concerned about security updates as I am about being able to build new software. My impression is that a given Ubuntu version going into ESM is a nail in the coffin of that version's compiler: developers are less likely to consider it a target platform and are therefore more likely to adopt features that require a newer GCC.

That said, I won't oppose using Ubuntu 18.04 if we agree to not bottle any formulae that don't build with GCC 7.

To be clear, I think that we shouldn't bottle anything that doesn't build with the host compiler on Linux, regardless of the Ubuntu version. The intensity of this opinion is decreasing in the distro version we decide to use.

carlocab avatar Jul 29 '22 17:07 carlocab

Thanks for opening @sjackman!

I propose updating the bottling distribution to a more recent version of Ubuntu.

@sjackman @danielnachun @iMichka can you spell out what you see as the pros and cons of e.g. Ubuntu vs. Debian vs. CentOS? I'm interested in evaluations based on how up-to-date they are, how long they are supported for, etc.

Distribution

Might be worth adding to these a column for when they are supported/EOL/security updates only?

Linux bottles ought to be built using the GCC provided by the bottling infrastructure.

Does this mean "the default /usr/bin/gcc from the host operating system"?

These ~300 bottles that currently depend on brewed GCC will no longer depend on brewed GCC.

What happens on OS versions that e.g. ship with an older GCC?

Please sort these tasks and PRs in the order that they ought to be completed and merged.

Some suggested additions from me:

  • documentation about how to update Linux versions and a heuristic for deciding when to upgrade in future e.g. we upgrade to every Ubuntu LTS release within a year of its release
  • the ability to use a matching host compiler on relevant systems where it's provided and install the relevant compiler elsewhere e.g. by doing brew install gcc@something automatically
    • bonus points if we somehow allow apt-get install gcc-something which is installed in /usr/bin/gcc-something and we still do the right thing
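The compiler selection suggested in the list above could look roughly like the sketch below. It is not how brew currently behaves: the function name, its arguments, and the idea of passing in the set of GCC major versions found on the host are all hypothetical, and "matching" is assumed to mean "same major version".

```python
def choose_compiler(required_major, host_gcc_majors):
    """Decide where the compiler for a source build should come from.

    required_major:   GCC major version used to build the bottles.
    host_gcc_majors:  set of GCC major versions found on the host, e.g.
                      discovered as /usr/bin/gcc-<major> (assumed layout).

    Returns a (source, detail) tuple: ("host", path) when the host has a
    matching compiler, otherwise ("brew", formula) naming the versioned
    formula that would have to be installed.
    """
    if required_major in host_gcc_majors:
        return ("host", f"/usr/bin/gcc-{required_major}")
    return ("brew", f"gcc@{required_major}")


if __name__ == "__main__":
    print(choose_compiler(11, {9, 11}))  # host already provides GCC 11
    print(choose_compiler(11, {9}))      # falls back to a brewed GCC
```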

Moving forward, I think it would be helpful to formalise and document the process of upgrading our bottling infrastructure, so it can be done more easily the next time we need to do it. Ideally, I'd like for it to be just changing some variables stored somewhere, but that might be asking for too much.

I'm not as concerned about security updates as I am about being able to build new software. My impression is that a given Ubuntu version going into ESM is a nail in the coffin of that version's compiler: developers are less likely to consider it a target platform and are therefore more likely to adopt features that require a newer GCC.

Strongly agree with all of this. I'd like us to have a macOS-like process that involves us making small and regular updates to the host system we use.

To be clear, I think that we shouldn't bottle anything that doesn't build with the host compiler on Linux, regardless of the Ubuntu version. The intensity of this opinion is decreasing in the distro version we decide to use.

@carlocab i.e. never depends_on "gcc"? I would be cautiously in favour of this, partly as a forcing function for us to use a newer rather than older version of Ubuntu/Debian/CentOS.

MikeMcQuaid avatar Jul 29 '22 18:07 MikeMcQuaid

@carlocab i.e. never depends_on "gcc"?

Well, not never, but probably closer to it than where we are now. depends_on "gcc" is fine if you need fortran, but that's currently only a small share of the GCC dependents on Linux.

I think we need to grapple with the tradeoff between portability (i.e. working on older Linux distros) and bottle coverage. Trying to have them both results in Homebrew/core being too difficult to maintain, so I suggest we decide which one is more important to us.

carlocab avatar Jul 30 '22 03:07 carlocab

Now that I've gotten glibc bootstrapping to work on CentOS 7 (see https://github.com/Homebrew/homebrew-core/pull/106837), we no longer need to worry about portability to even the oldest supported versions of Linux for at least the next several years, and I am hoping that long before then we'll have gotten glibc an :any cellar. glibc is the only place in brew or homebrew-core where we have to deal with concerns about portability - once it's installed, all other bottles should work, and source builds should only fail due to opportunistic linkage or shim bypasses that we actively want to fix.

The only factor I now think we need to consider in how we decide on the upgrade schedule for CI is how many users we want to have to use brewed glibc and gcc. Given that gcc is relocatable and glibc should now be very easy to build across a very broad range of systems, I am personally in favor of aggressively updating to the newest Ubuntu LTS as soon as it's available, meaning we would move to Ubuntu 22.04 LTS now.

While doing this will mean a larger percentage of our users will be relying on brewed gcc and glibc, the only major downside I can see to this is a modest increase in disk usage of ~350MB at most. There do seem to be occasional issues with build failures when using brewed glibc, but these always seem to be due to shim bypasses that we should actually try to fix. Furthermore, if the trend I observed while testing binary patching holds, about 95% of our non-relocatable formulae will become relocatable, making this mostly a moot point.

There may also be some failures even in CI when building older C++ packages with a newer GCC, but as one of the maintainers who has probably spent the most time dealing with C++ API issues recently, I can say from personal experience that I would overwhelmingly prefer spending my time fixing older code to work with newer versions of GCC. These fixes usually entail nothing more than adding a few missing include statements, while the opposite problem of building with a newer GCC when the CI host is too old can be a total nightmare, especially if the formula has C++ dependents.

The advantage of more aggressive updating is that it makes maintenance much easier. If we're updating our compiler toolchain every 2 years to what is basically the latest version, this will mean our support for C++ standards on Linux will be on par with if not even better than what we have on macOS. In my view we should prioritize maintainer free time/sanity over saving users some disk space. However, I may be overlooking some other downsides to more aggressive updating and am open to hearing other opinions.

danielnachun avatar Jul 31 '22 03:07 danielnachun

@sjackman you might also want to add that we need to change the preferred gcc in https://github.com/Homebrew/brew/blob/9c03493774500cf16ced8938e1eb4eeae8216b20/Library/Homebrew/extend/os/linux/compilers.rb to your list?

iMichka avatar Jul 31 '22 20:07 iMichka

Well, not never, but probably closer to it than where we are now. depends_on "gcc" is fine if you need fortran, but that's currently only a small share of the GCC dependents on Linux.

I think it might be worth us allowing depends_on "gfortran" to use an alias in this specific case to avoid confusion.

I think we need to grapple with the tradeoff between portability (i.e. working on older Linux distros) and bottle coverage. Trying to have them both results in Homebrew/core being too difficult to maintain, so I suggest we decide which one is more important to us.

Agreed. To me bottle coverage is unambiguously higher priority.

I am personally in favor of aggressively updating to the newest Ubuntu LTS as soon as it's available, meaning we would move to Ubuntu 22.04 LTS now. ... The advantage of more aggressive updating is that it makes maintenance much easier. If we're updating our compiler toolchain every 2 years to what is basically the latest version, this will mean our support for C++ standards on Linux will be on par with if not even better than what we have on macOS. In my view we should prioritize maintainer free time/sanity over saving users some disk space.

Strongly agreed with all of this. I think we should aim to have similar C++ support on macOS and Linux. I don't mind if we have a longer build-from-source tail on both but we should be clearer with our installation and runtime messaging when we expect things may break and that people should submit PRs on Linux rather than filing issues (like we do with older macOS versions).

MikeMcQuaid avatar Aug 01 '22 13:08 MikeMcQuaid

I think it might be worth us allowing depends_on "gfortran" to use an alias in this specific case to avoid confusion.

This would help to clarify things a lot and is very simple to implement.

Agreed. To me bottle coverage is unambiguously higher priority.

Fortunately there is no tension between portability and bottle coverage now that we better understand the glibc bootstrapping process. The real limitation in how old of a Linux system we can support is the build requirements for bootstrapping glibc, and these turn out to be far older than any system which is not EOL. I would not be surprised if our glibc 2.35 formula could even build on CentOS 6, which has been EOL for several years. The longest support cycle for any Linux distro is currently 13 years for SUSE Enterprise Linux but glibc could easily be bootstrapped on systems that are 15+ years old. So portability no longer even needs to factor into our considerations - glibc will work for all actively supported Linux systems.

Strongly agreed with all of this. I think we should aim to have similar C++ support on macOS and Linux. I don't mind if we have a longer build-from-source tail on both but we should be clearer with our installation and runtime messaging when we expect things may break and that people should submit PRs on Linux rather than filing issues (like we do with older macOS versions).

Once this migration is done I am going to work on finishing the CI part of binary prefix patching. Based on the testing I've already done, about 95% of non-relocatable formulae worked with no issues (and only some of the 5% that don't work even use C++), so I think the long tail will actually become very short, making even this concern mostly irrelevant. In the near future I expect that the only difference most Linux users will have if they are on an older system is the presence of the glibc and gcc@11 formulae.

danielnachun avatar Aug 02 '22 03:08 danielnachun

Fortunately there is no tension between portability and bottle coverage now that we better understand the glibc bootstrapping process.

Note: my earlier claim about the tradeoff between portability and bottle coverage is slightly more nuanced.

To rephrase it a bit: there is portability, bottle coverage, and maintainability of Homebrew and homebrew-core. Choose two.

The proposed solution for maintaining both portability and bottle coverage has introduced a lot of complexity into glibc (see https://github.com/Homebrew/homebrew-core/pull/106837), so we're still giving up maintainability for portability and bottle coverage. That's what we've been doing so far, which is why I have been trying to nudge us toward prioritising maintainability more than we have been.

I've also opened a PR to try to ease the pain from this transition: #13631

carlocab avatar Aug 02 '22 03:08 carlocab

The proposed solution for maintaining both portability and bottle coverage has introduced a lot of complexity into glibc (see Homebrew/homebrew-core#106837), so we're still giving up maintainability for portability and bottle coverage. That's what we've been doing so far, which is why I have been trying to nudge us toward prioritising maintainability more than we have been.

This is an important point :+1:

MikeMcQuaid avatar Aug 02 '22 09:08 MikeMcQuaid

If the complexity is concentrated in the glibc formula, I'm happy with that. glibc is updated once every 6 months, and there is often no big rush to get it updated to the latest version. It's also a Linux-only formula, and one that is not necessary if your Linux is new enough (which is the case for a lot of our users, looking at the stats provided by @sjackman).

iMichka avatar Aug 03 '22 11:08 iMichka

Actually, glibc should only be updated when the CI image is updated, so it would only get updated every 2 years if we follow the LTS release schedule, making this even less work to maintain.

danielnachun avatar Aug 04 '22 04:08 danielnachun

[Figures: bar chart of Linux distributions; bar chart of glibc versions]

On 2022-07-29 I wrote…

The majority of our users are using Ubuntu 20.04 and Glibc 2.31. I suggest that we build bottles using Ubuntu 20.04 and Glibc 2.31. 79% of our users (23% on Glibc 2.35 and 56% on Glibc 2.31) will be able to use their host's Glibc and compiler without needing to install the brewed Glibc and GCC.

If we use Ubuntu 22.04 and Glibc 2.35 to build bottles, only 23% of our users will have a new enough version of Glibc on their host system to use our bottles, and the rest will need to install the brewed glibc and gcc. A system with a single version of Glibc is less complex and easier to administer than a system with multiple versions of Glibc, which is pretty unusual. I would prefer that a majority of our users be able to use their own host's Glibc and compiler, as we do on macOS.
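The 79% and 23% figures can be reproduced from the two shares quoted above. A small sketch, assuming a bottle is usable with the host Glibc whenever the host version is at least the version the bottle was built against; only the two shares named in this comment are included, and the remaining users are treated as needing the brewed glibc:

```python
# Share of users per host Glibc version, in percent, as quoted above.
# Other versions are not broken down in the comment and are omitted here.
GLIBC_SHARE_PERCENT = {
    (2, 35): 23,
    (2, 31): 56,
}


def share_using_host_glibc(bottle_glibc):
    """Percentage of users whose host Glibc is at least bottle_glibc."""
    return sum(
        share
        for host_version, share in GLIBC_SHARE_PERCENT.items()
        if host_version >= bottle_glibc
    )


if __name__ == "__main__":
    for name, version in (("Ubuntu 20.04", (2, 31)), ("Ubuntu 22.04", (2, 35))):
        print(f"Bottling on {name}: "
              f"{share_using_host_glibc(version)}% can use the host Glibc")
```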

That said, I'm open to using either Ubuntu 20.04 or Ubuntu 22.04 for building bottles, whichever we settle on. I'd be curious to know how many bottles cannot be built using the compiler provided by Ubuntu 20.04 (GCC 9.4.0), but could be built by Ubuntu 22.04 (GCC 11.2.0).

https://github.com/Homebrew/brew/pull/13625#issuecomment-1205062494 @MikeMcQuaid wrote…

I'd probably like to see something like e.g. we use the latest LTS at least ~3-4 months after release and we always migrate to the latest LTS within at most 12 months of release.

This timeline seems reasonable to me.
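As a rough illustration of that timeline, the sketch below turns an LTS release date into a migration window. The 3-month lower bound is the low end of the "~3-4 months" suggested above, the function names are made up for this example, and the Ubuntu 22.04 release date of 2022-04-21 is used as the sample input.

```python
import calendar
from datetime import date


def add_months(day, months):
    """Return the date `months` calendar months after `day`.

    The day of month is clamped to the last day of the target month,
    so 31 January plus one month becomes the last day of February.
    """
    month_index = day.month - 1 + months
    year = day.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def migration_window(lts_release, earliest_months=3, latest_months=12):
    """Return (earliest, latest) dates for moving bottling to a new LTS."""
    return (
        add_months(lts_release, earliest_months),
        add_months(lts_release, latest_months),
    )


if __name__ == "__main__":
    earliest, latest = migration_window(date(2022, 4, 21))
    print(f"Migrate between {earliest} and {latest}")
```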

sjackman avatar Aug 04 '22 19:08 sjackman

A system with a single version of Glibc is less complex and easier to administer than a system with multiple versions of Glibc, which is pretty unusual.

Can you elaborate on this? I don't want to totally dismiss this concern but I think it would be great to have more concrete examples of how this is a problem. Speaking from personal experience, the only times I've seen problems with brewed glibc in my system have been when brewed and non-brewed libraries/binaries get mixed up, i.e. linking against a brewed library while using a non-brewed GCC that ends up using the older host glibc. While these problems are unfortunate, there are some pretty serious downsides to using an older GCC to build formulae that may outweigh these concerns.

I'd be curious to know how many bottles cannot be built using the compiler provided by Ubuntu 20.04 (GCC 9.4.0), but could be built by Ubuntu 22.04 (GCC 11.2.0).

The even harder question to answer here is: how many bottles will not be able to be built with GCC 9 but can be built with GCC 11 in 2 years when we would be preparing to transition to 22.04? It's very difficult to predict this but my concern is that if it gets too large, we will be stuck back at the same issue we've struggled with before. While I strongly oppose completely blocking the use of brewed GCC for any formulae in Linux, I think we should do everything we can to keep the number of formulae that need this as small as possible to maximize maintainability.

danielnachun avatar Aug 05 '22 05:08 danielnachun

If we use Ubuntu 22.04 and Glibc 2.35 to build bottles, only 23% of our users will have a new enough version of Glibc on their host system to use our bottles, and the rest will need to install the brewed glibc and gcc. A system with a single version of Glibc is less complex and easier to administer than a system with multiple versions of Glibc, which is pretty unusual. I would prefer that a majority of our users be able to use their own host's Glibc and compiler, as we do on macOS.

On the flip side, for portable Ruby we optimise for the newest version shipped by a version we support because more and more users will use it over time (particularly with how infrequently we upgrade, same with glibc). That's not to say we definitely should do 22.04: but I think there's a reasonable argument for it.

That said, I'm open to using either Ubuntu 20.04 or Ubuntu 22.04 for building bottles, whichever we settle on. I'd be curious to know how many bottles cannot be built using the compiler provided by Ubuntu 20.04 (GCC 9.4.0), but could be built by Ubuntu 22.04 (GCC 11.2.0).

Same, this data would be useful 👍🏻

I'd probably like to see something like e.g. we use the latest LTS at least ~3-4 months after release and we always migrate to the latest LTS within at most 12 months of release.

This timeline seems reasonable to me.

Thanks @sjackman! I think figuring out a timeline that's both desirable and attainable will help us figure out a strategy that we can stick to in future and communicate to users.

MikeMcQuaid avatar Aug 05 '22 09:08 MikeMcQuaid

A system with a single version of Glibc is less complex and easier to administer than a system with multiple versions of Glibc, which is pretty unusual.

Can you elaborate on this? I don't want to totally dismiss this concern but I think it would be great to have more concrete examples of how this is a problem.

Agreed, would be good to expand on this when you get a chance @sjackman.

The even harder question to answer here is: how many bottles will not be able to be built with GCC 9 but can be built with GCC 11 in 2 years when we would be preparing to transition to 22.04? It's very difficult to predict this but my concern is that if it gets too large, we will be stuck back at the same issue we've struggled with before.

This is a great point @danielnachun. I think we need to think not just about "where are we today?" but "where will we be at the latest point before the next migration?". This nudges me towards the newest version of an Ubuntu LTS (with a couple of months' gap to allow it to stabilise).

MikeMcQuaid avatar Aug 05 '22 09:08 MikeMcQuaid

I have an important update on the glibc side of things - after a lot of discussion we've arrived at a very robust and simple solution in https://github.com/Homebrew/homebrew-core/pull/106837 that means we should no longer need to worry about getting glibc to build on even the oldest supported Linux distros. So we no longer need to factor portability into our discussions, and I think we can be a lot less concerned about getting glibc installed in non-default prefixes.

danielnachun avatar Aug 15 '22 17:08 danielnachun

The glibc update has shipped. The only other blocker now is https://github.com/Homebrew/brew/pull/13577 and that is almost finished too.

One other detail we need to take care of is making sure we create the symlinks to $HOMEBREW_PREFIX/lib in gcc@11 instead of gcc@5 now that this will be our default compiler. That is the only other PR I can think of that isn't on our to do list.

danielnachun avatar Aug 23 '22 03:08 danielnachun

One other detail we need to take care of is making sure we create the symlinks to $HOMEBREW_PREFIX/lib in gcc@11 instead of gcc@5 now that this will be our default compiler. That is the only other PR I can think of that isn't on our to do list.

@danielnachun Can you tackle that? If not, can you find someone to do so? The next tag is going to be 3.6.0 where we'll make and announce these changes.

MikeMcQuaid avatar Aug 23 '22 10:08 MikeMcQuaid

I can definitely do that; the only question is whether to make it a separate PR or part of https://github.com/Homebrew/homebrew-core/pull/108590.

danielnachun avatar Aug 23 '22 17:08 danielnachun

@danielnachun I'm fine either way! If they need to happen at once: doing in the same PR makes sense.

MikeMcQuaid avatar Aug 23 '22 17:08 MikeMcQuaid

The change has been added to https://github.com/Homebrew/homebrew-core/pull/108590. AFAIK there are no other changes needed for the migration that aren't already in an existing PR. I've already opened PRs to drop using [email protected] for zlib and binutils as well, though they don't need to be merged for the migration to happen.

danielnachun avatar Aug 23 '22 18:08 danielnachun

Just curious, how does it break if we don't create those symlinks? I think I saw in some chatter that lib/gcc/current is now added to the run time RPATH when pouring bottles. At compile time, GCC ought to be able to find its own .so and .a libraries, because lib/gcc/VERSION ought to be in its default search path, so maybe these symlinks aren't necessary? It's entirely possible they are necessary.

  • https://github.com/Homebrew/brew/pull/13659

sjackman avatar Aug 24 '22 17:08 sjackman

Similarly to how we bumped the brewed Glibc version before building bottles with that Glibc version, perhaps we can also bump the preferred brewed GCC version before building bottles with that GCC version.

sjackman avatar Aug 24 '22 17:08 sjackman

I'll have to test all this locally, but assuming we always install gcc and not gcc@11, it should be okay. Not having those symlinks would actually be great because it would permanently solve the somewhat recurring issue of those links breaking things, and bring us one step closer to dropping the RPATH to $HOMEBREW_PREFIX/lib entirely, which we've all agreed is a good idea in the long run.

danielnachun avatar Aug 24 '22 17:08 danielnachun

Actually, one major issue with this is existing installations - all formulae would have to be reinstalled because they wouldn't have the RPATH. That seems like a pretty huge breakage that we should avoid for now. I think we can eventually have some audits to handle this automatically for the user but we have to figure those out.

danielnachun avatar Aug 24 '22 19:08 danielnachun

I'd like to have a stable release of Homebrew that migrates our users to glibc 2.35 and gcc@11 before we move our bottling infrastructure to Ubuntu 22.04.

  • https://github.com/Homebrew/brew/pull/13752
  • https://github.com/Homebrew/homebrew-core/pull/108816

sjackman avatar Aug 24 '22 21:08 sjackman

I think that makes sense. The actual switchover of the CI infrastructure has no user-facing effects, and switching everyone to glibc 2.35 and gcc@11 is just a regular upgrade for users.

danielnachun avatar Aug 24 '22 21:08 danielnachun

I'd like to have a stable release of Homebrew that migrates our users to glibc 2.35 and gcc@11 before we move our bottling infrastructure to Ubuntu 22.04.

I'd like to move the bottling infrastructure at the same time we release 3.6.0.

MikeMcQuaid avatar Aug 25 '22 11:08 MikeMcQuaid