libc
libc copied to clipboard
Upgrade musl supported version to 1.2.3
Rebase of #2088.
This PR changes the supported version of musl libc to 1.2.3. The bulk of the changes in this PR are a result of time_t changes to support 64-bit time on all platforms (the time64 change). This is an API breaking change because of an additional private padding field which is added on some platforms.
This change has been run through crater several times. The last run reported 959 regressions, of which upgrading to a newer major or minor version of a dependency would resolve the regression for >700 crates.
A breakdown of which targets contain this breaking change and what Target Tier they are is available in the compiler team MCP to upgrade the *-linux-musl targets to musl 1.2.
Thanks to @kaniini, @richfelker, @danielframpton and @Amanieu for their work on this!
Fixes #1848 Closes #2088
r? @JohnTitor
(rustbot has picked a reviewer for you, use r? to override)
This is technically a breaking change due to the change to time_t, and quite a significant change. However, it's something that will have to be done anyways as we follow upstream musl.
@rfcbot fcp merge
FCP has to be run on repos that enable @rfcbot.
This also affects the library users, I guess we have to release this as 0.3.
Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:
- [x] @Amanieu
- [ ] @Dylan-DPC
- [ ] @Mark-Simulacrum
- [x] @cuviper
- [ ] @joshtriplett
- [x] @m-ou-se
- [ ] @pietroalbini
- [ ] @the8472
- [x] @tmandry
No concerns currently listed.
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
@rfcbot reviewed
This also affects the library users, I guess we have to release this as 0.3.
Strictly, yes, but I think we can exercise leeway here -- it only affects a subset of targets, and those targets made the same breaking change at their root.
Let me make sure I understand:
- When building on affected targets (where the 3rd column is "yes" in this table: https://github.com/rust-lang/compiler-team/issues/572),
- Updating the Rust compiler
- Will require you to install/build against a newer version of musl
- Which if you happen to use the libc crate, will also require an upgrade (it must use the same musl version as the toolchain)
- Which itself may be API incompatible with your other dependencies that also use the libc crate
The last two are what is being referred to here: "of which upgrading to a newer major or minor version of a dependency would resolve the regression for >700 crates"
Is that right?
Strictly, yes, but I think we can exercise leeway here -- it only affects a subset of targets, and those targets made the same breaking change at their root.
Fair enough! But we have some more breakage candidates and I guess we could use this opportunity to ship them together. The problem is that we don't have any policy about supported toolchain/OS/etc. versions and it may make the 0.3 release painful. Any thoughts on this?
@tmandry It sounds right to me except for the last one. @wesleywiser I'm not sure about "which upgrading to a newer major or minor version of a dependency would resolve the regression for >700 crates" and the report is now unavailable for me. Does it mean releasing this change and upgrading the libc dependency will resolve regressions?
Great questions! I think I can answer both @tmandry and @JohnTitor's by explaining a bit more how the libc crate and the musl targets interact.
The libc crate is a bit unique in the space of -sys crates in that what library it binds to is chosen by your target rather than using feature flags or even just picking a different library. That means there is a relationship between the targets we ship with Rust and what the libc crate should provide bindings to. This change is really the first step to upgrading the version of musl our -linux-musl targets use. In order to do that though, we have to have this support in the libc crate because the standard library itself uses this crate for various syscalls.
The -linux-musl targets themselves vary a bit as some, such as x86_64-unknown-linux-musl, statically link by default and ship with our own build of musl while other targets dynamically link by default and you can provide your own musl version. It's possible today to use your own version of musl (including musl 1.2) but for any target affected by the time64 changes, you run the risk of mismatched bindings.
The main change that has caused issues in the ecosystem is that timespec now contains padding on some platforms (most notably i686). You can see this in the musl definition for the type and the corresponding definition in these changes. As a result of adding the padding field, code that used to initialize like a libc::timespec like this timespec { tv_sec: 0, tv_nsec: 0 } no longer compiles because of the private padding field. This usually manifests itself in dependencies that sit between libc and "user code" such as time, nix, parking_lot, polling, filetime, etc no longer compiling.
I've been working with maintainers for the most widely used crates to fix these issues and ship new releases. Many have even shipped point releases for older versions that are still widely used but there is a long tail of downstream crates that are on ancient versions of some of these dependencies and haven't been touched in years. These are the crates I'm referring to when I say "upgrading to a newer major or minor version of a dependency {such as time, nix, parking_lot, etc} would resolve the regression".
My overall plan here is to:
- Merge these changes into
libcand cut a new release. - Update my PR to upgrade Rust targets to musl 1.2.3 to use the new
libcrelease. - Start an FCP to merge that PR (the compiler team has already approved the MCP to do this).
- Write a post for the Rust blog describing what is changing and the timeline when the target changes are expected to hit stable.
- Merge the target changes PR.
What will happen if a user upgrades the libc crate between when we cut a new release and the changes to the Rust targets hit stable?
Also, what happens when there's an incompatibility between the libc crate and the actual version of musl (when the user supplies their own)? Is there at least a linker error, or is it silent UB?
Can we detect the difference in the build script's is_musl_time64_abi?
What will happen if a user upgrades the libc crate between when we cut a new release and the changes to the Rust targets hit stable?
Also, what happens when there's an incompatibility between the libc crate and the actual version of musl (when the user supplies their own)? Is there at least a linker error, or is it silent UB?
From looking at the builds I have from crater, it seems (nearly?) all of the affected functions are new in musl 1.2. As a result, I think most any scenario where a newer version of the libc crate is being used with older targets will result in linking errors (but perhaps only if the affected functions are used). In the reverse situation (a newer target with an older libc), the symbols will line up with correct types so it should compile, link and execute fine (with no UB).
So having written that out, I think my plan has the wrong order: we should ship the updated Rust targets first and then land this change to libc. However, I think this FCP is still useful for confirming that we are going to ship these changes, we'll just hold off merging them until the updates to the targets have reached stable.
Can we detect the difference in the build script's is_musl_time64_abi?
This should be possible by detecting the presence of the *_time64 functions.
Okay so just to confirm, we're choosing to make a breaking change to the libc crate on these targets rather than add a second variant of all the affected APIs with a different name. (Sorry if you said that and I missed it.) Users won't be affected until they upgrade the libc crate, at which point hopefully all the affected ecosystem crates can be upgraded to fix any errors.
Seems ok to me. Now I wish we could mark structs as #[non_exhaustive] to prevent future breakage, though that would be a breaking change :).
Fair enough! But we have some more breakage candidates and I guess we could use this opportunity to ship them together. The problem is that we don't have any policy about supported toolchain/OS/etc. versions and it may make the 0.3 release painful. Any thoughts on this?
You mean there's no policy on supported platforms for the libc crate in particular? Procedurally, who/what team would be responsible for deciding that?
@rfcbot reviewed
You mean there's no policy on supported platforms for the libc crate in particular?
Yes, e.g. https://github.com/rust-lang/libc/issues/1412. Since we have a lot of supported env, I guess it's hard to make a policy. 🤔
Procedurally, who/what team would be responsible for deciding that?
In the past the libc team took care of such a topic but it's now unified with the crate-maintainers team. Since there is an overlap between the members of this team and the libs, and since the libc is somewhere between the community crate and the internal crate, I feel it makes some sense to ask here incidentally.
As a result of adding the padding field, code that used to initialize like a
libc::timespeclike thistimespec { tv_sec: 0, tv_nsec: 0 }no longer compiles because of the private padding field.
If the unnamed fields RFC were implemented we could add an anonymous wrapper struct with a repr to add extra alignment without introducing padding fields. That way libs could keep using the struct initialization syntax.
struct timespec as a whole should still only have 4-byte alignment on 32-bit though, despite padding that makes it look like 8-byte alignment. Plus, on big-endian that padding comes between the two fields, pushing tv_nsec to occupy the same least-significant bits that it would in the 64-bit version, and you can't do that with alignment.
This is not just musl being weird on its own -- glibc did the same thing, but only in the new struct __timespec64.
https://sourceware.org/git/?p=glibc.git;a=blob;f=include/struct___timespec64.h;h=9abb25c8f78813c48ae183e1dcf072f3b28b5c6d;hb=HEAD
@rfcbot reviewed
This is not just musl being weird on its own -- glibc did the same thing
To add some context on why, it was necessary to produce a structure that can be directly read and written by the kernel, since the kernel folks wanted to be able to use the same code paths for 32-bit and 64-bit syscalls on archs which have both. Initially they actually wanted to make tv_nsec be a 64-bit field on both, but that conflicts with intersecting requirements of the C language making the type of tv_nsec be long and the ABI definition of long as 32-bit on 32-bit targets.
We discussed this in the T-libs meeting today. We think that it would be better to do this as part of a libc 0.3 major release. This would be developed in a branch until it is ready for release. In any case this change is blocked on rustc updating the version of musl shipped with rustup, at which point libc 0.3 would take that version as MSRV (the time64 ABI functions it links to don't exist on musl 1.1).
We also discussed the possibility of using the semver trick to reduce breakage, but this was deemed to be too difficult to maintain for a crate with such a large API surface. Releasing libc 0.3 should cause much less breakage in practice than the 0.1 -> 0.2 libcpocalyse (which was due to * versions in Cargo.toml).
This is also a good opportunity to work through the breakage candidates and assemble a wishlist of breaking changes that we would want to make to improve libc's API and make its maintenance easier.
@Amanieu Should we cancel the FCP then?
@rfcbot fcp cancel
@Amanieu proposal cancelled.
:umbrella: The latest upstream changes (presumably #3138) made this pull request unmergeable. Please resolve the merge conflicts.
@Amanieu that seems like a reasonable approach to me. My impression from reading other issues that mentioned the "libcpocalyse" was that a 0.3 was fairly unlikely. It sounds to me like that may no longer be the case? Is there any kind of known timeline for when a 0.3 release might happen?
I was under the impression this would be a soon-upcoming thing the change would be rolled into. If this is tabling it long-term, that's extremely disappointing. Y2038 is rapidly approaching and programs need to be able to deal with timestamps for things like validity periods now. If there is any breakage from application code making wrong assumptions, dealing with that now will be a lot better than dealing with it when it becomes progressively more urgent.
At the very least, libc 0.3 would only be possible once https://github.com/rust-lang/rust/pull/102891 is merged, and that would become the MSRV for libc 0.3 (since it requires rust to ship musl 1.2).
The best way to start would be to create a tracking issue for libc 0.3 with a list of all the breaking changes we would like to make. Once we have decided what changes we want and implemented them in a branch then we can go ahead and just release it.
The original libcpocalypse was due to crates using * version dependencies, which are now banned from crates.io. So the impact should be much less severe this time. There will be no immediate breakage, instead crates will incrementally upgrade to libc 0.3 as their dependencies do so.
triage: https://github.com/rust-lang/rust/pull/107129 has been merged and was released as 1.71. Can we set 0.3's MSRV as 1.71 and restart an FCP?
@rust-lang/libs Could we restart FCP or are there any other blockers?
If there's consensus to bump the minimum supported version of musl to 1.2.x, I'm happy to pick this back up and get it ready to merge.
@rfcbot cancel