Update glibcLocales more often?
Background
With every new Unicode release, the glibc locales have to be updated upstream to include the new character information (e.g. for Unicode 15: https://sourceware.org/git/?p=glibc.git;a=commit;h=7fe6734d28feb18acb3c50b13a5f5a52f66d39cf). Some ncurses programs rely on this information to display characters properly (see e.g. https://github.com/weechat/weechat/issues/79).
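To make the dependency concrete, here is a rough sketch of how one could check what the installed locale data says about a character. It asks the C library for the column width via `wcwidth`, which is the kind of lookup terminal programs depend on. It assumes a Linux system where `ctypes.CDLL(None)` exposes libc. The helper name `column_width` and the sample character U+1FAE8 (added in Unicode 15) are only illustrative choices, not anything taken from nixpkgs or weechat.

```python
import ctypes
import locale

# Assumption: on Linux, CDLL(None) gives access to the libc symbols that are
# already loaded into the Python process, including wcwidth().
libc = ctypes.CDLL(None)
libc.wcwidth.argtypes = [ctypes.c_wchar]
libc.wcwidth.restype = ctypes.c_int


def column_width(ch: str) -> int:
    """Return the number of terminal columns libc assigns to `ch`.

    libc returns -1 when the active locale data does not classify the
    character as printable. `column_width` is a hypothetical helper name.
    """
    return libc.wcwidth(ch)


if __name__ == "__main__":
    try:
        # Activate the locale from the environment (LANG / LC_ALL).
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The requested locale is not installed; stay in the default C locale.
        pass
    for ch in ("A", "\U0001FAE8"):
        print(f"U+{ord(ch):04X}: width {column_width(ch)}")
```

The result for the second character depends on which Unicode version the installed locale data was generated from, so no expected output is given here.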
It would be nice to be able to use the new locale data version as soon as it's available, but currently glibcLocales and glibc share the same source so this is tied to updating glibc, which may take a while (the current PR is still a draft, and updates to a version that still does not include the mentioned commit).
Questions
Would it make sense to detach the source of glibcLocales from glibc, allowing us to update it more often? Could problems arise from it being out of sync with glibc?
I'm a bit confused by the glibc derivation: ~why do we use master instead of sticking to the latest release~ (found the answer to this one)? Why do we ship the diff from latest release to master in nixpkgs? Is there really no simple way to fetch a given revision from git without running into bootstrapping issues?
cc @vcunat @trofi @Ma27
It's already happened that the format of the locale archive changed incompatibly between glibc versions. But I expect that to be rare. Still, I'm personally not motivated by using brand-new Unicode characters...
> It's already happened that the format of the locale archive changed incompatibly between glibc versions. But I expect that to be rare.
… and detectable, so the update can be quickly delayed as currently unusable.
> It would be nice to be able to use the new locale data version as soon as it's available, but currently glibcLocales and glibc share the same source so this is tied to updating glibc, which may take a while (the current PR, https://github.com/NixOS/nixpkgs/pull/188492, is still a draft, and updates to a version that still does not include the mentioned commit).
I suggest asking upstream to backport the Unicode changes to at least the 2.36 release branch (and possibly the 2.35 release branch). Then it would be natural for distributions to update to it sooner.
I would say using an out-of-sync localedata/ primarily causes surprises for users when locale behaviour changes (for better or worse). Currently upstream ties these potentially breaking changes to the next major release, which is somewhat expected by users. Shipping these changes earlier might cause hard-to-track-down surprises.
On top of that, localedata/ is not enough to use separately. locale/ would likely have to go in sync, and that sometimes relies on other glibc internals. Sounds a bit fragile to use safely.
We should upgrade glibc more frequently :)