
libcrypt.so.1 being phased out?

Open dvarrazzo opened this issue 5 years ago • 22 comments

Hello,

Just received https://github.com/psycopg/psycopg2/issues/912, according to which libcrypt.so.1 was deprecated on Fedora 30. This sounds like manylinux wheel packages (both 1 and 2010) are incompatible with the platform.

Is there anything I can, or should, do? Would including libcrypt (maybe 2) in the wheel package be a good idea?

dvarrazzo avatar May 03 '19 21:05 dvarrazzo

In case anyone else is thrown by the name: this isn't libcrypto, which comes from openssl, but rather libcrypt, which is part of glibc. This appears to be one of the rare cases where glibc is intentionally choosing to break backwards compatibility.

Practically speaking, if some systems have libcrypt.so.1 and some have libcrypt.so.2, then manylinux wheels can't assume that either exists. I guess that means that they should be dropped from auditwheel's library whitelist, and that they should be vendored into any wheels that need them. Normally I'd hesitate to vendor part of glibc and ship it to a system that might be using a different glibc, but hopefully the libcrypt functionality is sufficiently self-contained that it will work out okay...?

@zackw might be able to advise us further.

njsmith avatar May 04 '19 04:05 njsmith

It seems Fedora has gone further with this transition than I thought any Linux distribution would seriously consider doing, in the relatively short time it's been since the new "libxcrypt" became a thing. There are actually three versions of the library in play:

A. libcrypt.so.1 built from GNU libc's source code. Hasn't changed much in years and, in particular, hasn't been keeping up with advances in password hashing. About two years ago, a couple of other glibc developers and I decided the glibc development process was too conservative and slow for this library, and we began a process of splitting it out.

B. libcrypt.so.1 built from besser82/libxcrypt. This has support for newer password hashing algorithms (bcrypt, scrypt, etc), and maintains compatibility with binaries built against glibc's libcrypt, but binaries built against its headers will not be compatible with glibc's libcrypt, and you cannot compile programs that use certain obsolete functions against it (bigcrypt, fcrypt, encrypt, setkey -- nobody should be using these anymore, since they all inherently involve the use of single DES, but it's still a consideration). Fedora seems to have switched to this library in version 28.

C. libcrypt.so.2 built from besser82/libxcrypt (configure option --disable-obsolete-api). This drops binary backward compatibility with both A and B. Fedora seems to have switched to this in version 30.

I expected Linux distributions to converge on B, not C, but I guess "any program using these functions is necessarily using a block cipher with an unacceptably short key" was enough of a reason for Fedora to break them.

I agree with @njsmith that, as a practical matter, it's no longer appropriate for libcrypt.so.1 to be on the manylinux whitelist. I'm not sure how safe it is to vendor A (glibc's libcrypt); the copy on my computer does refer to some internal-use-only symbols from libc.so.6. Vendoring B or C, however, should be safe; libxcrypt uses only public interfaces, and it shouldn't matter to any program if they get crypt@GLIBC_2.2.5 from B instead of A.

What I'd recommend, therefore, is that the manylinux1 and manylinux2010 base images be modified to supply option C (libcrypt.so.2) as the libcrypt that will be used to build extension modules and vendored into wheels. Specifically, build libxcrypt with --disable-obsolete-api --enable-hashes=all, install it in /usr/local, and then delete the crypt.h, libcrypt.a, and libcrypt.so (but not libcrypt.so.1) installed by glibc in /usr. This will break any extension modules using the obsolete functions, but if that's okay for Fedora I think it should be okay for PyPA as well. Using --enable-hashes=all guarantees backward compatibility with all historical hashed passwords.
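The recipe above can be written down as a dry-run plan. This is only a sketch: the two configure flags come from the comment above, `--prefix=/usr/local` is the usual autoconf option, and the `build_plan` helper and the glibc file locations (`/usr/include`, `/usr/lib64`) are assumptions that will differ between images and architectures. Nothing is executed; the commands are only printed.

```python
import shlex


def build_plan(prefix="/usr/local",
               glibc_include="/usr/include",   # assumed location of glibc's crypt.h
               glibc_libdir="/usr/lib64"):     # assumed location of glibc's libcrypt.*
    """Return the recipe as a list of argv lists. Nothing is run."""
    return [
        # Build libxcrypt without the obsolete DES-based API, with all hashes.
        ["./configure", "--disable-obsolete-api", "--enable-hashes=all",
         f"--prefix={prefix}"],
        ["make"],
        ["make", "install"],
        # Remove glibc's development files so new builds pick up libxcrypt,
        # but leave glibc's libcrypt.so.1 runtime library in place.
        ["rm", "-f",
         f"{glibc_include}/crypt.h",
         f"{glibc_libdir}/libcrypt.a",
         f"{glibc_libdir}/libcrypt.so"],
    ]


for argv in build_plan():
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping the plan as data makes it easy to review the exact file list before anything is deleted from an image.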

If there's a problem building libxcrypt in a CentOS 5 environment, please file a bug report on besser82/libxcrypt and I'll see it gets fixed.

zackw avatar May 04 '19 14:05 zackw

So if I'm understanding this correctly, the steps to be taken are:

  • [x] Remove libcrypt.so.1 from the libraries whitelist for manylinux1 (PEP 513) and manylinux2010 (PEP 571) - https://github.com/python/peps/pull/1124 .
  • [x] Remove it from the profiles in auditwheel - https://github.com/pypa/auditwheel/pull/182
  • [x] Add libcrypt.so.2 to the three existing manylinux docker images for building packages, so wheels can bundle that.
    • [x] manylinux2010_x86_64 - #325
    • [x] manylinux1_x86_64 - #324
    • [x] manylinux1_i686 - #324

takluyver avatar Jul 18 '19 20:07 takluyver

Shouldn't we update the PEPs as well? auditwheel will start reporting previously compliant wheels as non-compliant and given that libcrypt.so.1 is still on the list of allowed libraries in the PEPs, this can create confusion.

lkollar avatar Jul 18 '19 21:07 lkollar

Yup, updating the PEPs is done now: python/peps#1124.

takluyver avatar Jul 19 '19 07:07 takluyver

@takluyver When adding libcrypt.so.2 to the manylinux docker images, make sure to verify that it gets used for wheels that call the C function crypt, rather than the libcrypt.so.1 that shipped with CentOS. I am 95% sure that the changes I described above

build libxcrypt with --disable-obsolete-api --enable-hashes=all, install it in /usr/local, and then delete the crypt.h, libcrypt.a, and libcrypt.so (but not libcrypt.so.1) installed by glibc in /usr

will accomplish this, but it still needs testing.

zackw avatar Jul 19 '19 12:07 zackw

To check that, I assume one can run ldd on a relevant binary and check which libcrypt it's linked against?
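A minimal sketch of that check, assuming the usual `name => path (address)` layout of ldd output. The helper name `libcrypt_from_ldd` and the sample output are hypothetical; in practice the text would come from running ldd on the built extension module.

```python
import re

# Matches "libcrypt.so.N => <path>" lines only. libcrypto.so.* (OpenSSL) and
# libk5crypto.so.* do not match, because "libcrypt" must be followed by ".so.".
_LIBCRYPT_LINE = re.compile(
    r"^\s*(libcrypt\.so\.\d+)\s+=>\s+(not found|\S+)", re.MULTILINE
)


def libcrypt_from_ldd(ldd_output):
    """Return (soname, resolved path) pairs for libcrypt entries in ldd output."""
    return _LIBCRYPT_LINE.findall(ldd_output)


# Hypothetical ldd output, for illustration only.
sample = """\
\tlinux-vdso.so.1 (0x00007ffd00000000)
\tlibcrypt.so.2 => /usr/local/lib/libcrypt.so.2 (0x00007f0000000000)
\tlibcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007f0000100000)
\tlibc.so.6 => /lib64/libc.so.6 (0x00007f0000200000)
"""

for soname, path in libcrypt_from_ldd(sample):
    print(soname, "->", path)
```

Note that ldd reports what the current system would resolve, so the result depends on the machine where it is run.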

It would also be good if someone could provide a sample extension module which calls crypt, because it would probably take me a long time to piece together an example.

Finally: anyone reading this, please feel free to work on the manylinux images. I'm about to be offline for a week, and even when I am here, I'm not that hot on anything that involves writing or compiling C code. I wrote the checklist above as what needs doing, not saying that I'm going to do it. :slightly_smiling_face:

takluyver avatar Jul 19 '19 12:07 takluyver

I learned about this issue in EuroPython. After some discussion, we're updating Fedora's pip to Recommend libcrypt.so.1. (“Recommend” means a soft dependency that's installed by default.) This should hide the issue for regular users. Of course, removing it from the manylinux standards is still the way to go.

When testing on Fedora, please make sure you don't have libcrypt.so.1 (use sudo dnf remove libxcrypt-compat).

encukou avatar Jul 19 '19 13:07 encukou

(While I'm not doing systematic testing, I did note somewhere that I haven't come across any wheels which fail to load. So just to confirm: I don't have libcrypt.so.1; removing libxcrypt-compat said "Nothing to do.")

takluyver avatar Jul 19 '19 13:07 takluyver

I submitted PRs for the manylinux images. GCC 8.2 failed to build libxcrypt on the manylinux2010 image so I will have to look into that.

lkollar avatar Jul 19 '19 14:07 lkollar

I'm writing a trivial extension module to test this change with. Would it be useful for me to put an sdist of this module on PyPI, or will it be enough to have a public git repo? I don't know Fedora well enough to do the actual testing myself.

zackw avatar Jul 19 '19 14:07 zackw

Thanks @lkollar . I've updated the checklist above to point to those PRs.

@zackw - we can probably work with it in any form, but if it's easy to put an sdist on PyPI, please do. It may be useful to have a tarball with a stable hash - I'm told that the automatic tarballs from GitHub tags can change slightly.

takluyver avatar Jul 19 '19 15:07 takluyver

@takluyver OK, it's uploaded: https://pypi.org/project/pyphash/ and/or https://github.com/zackw/pyphash

zackw avatar Jul 19 '19 16:07 zackw

After all of the steps listed above are taken, we should look through PyPI for binary wheels that use libcrypt.so.1 and poke their maintainers to rebuild and re-upload them. (As far as I know, there's no more automatic way to make this happen.)
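One rough way to find such wheels is a byte-level search of the shared objects inside each wheel. This is a heuristic sketch, not a real ELF parser: it looks for the NUL-delimited string `libcrypt.so.1`, as it would appear in a dynamic string table, and does not prove the string is a DT_NEEDED entry. The function name is hypothetical.

```python
import sys
import zipfile

# Strings in an ELF string table are NUL-terminated, so searching for the
# NUL-delimited name avoids matching "libcrypt.so.1" inside a longer string.
NEEDLE = b"\x00libcrypt.so.1\x00"


def members_mentioning_libcrypt1(wheel):
    """Return names of shared objects in a wheel that mention libcrypt.so.1.

    `wheel` may be a path or a binary file-like object.
    """
    hits = []
    with zipfile.ZipFile(wheel) as zf:
        for name in zf.namelist():
            basename = name.rsplit("/", 1)[-1]
            if ".so" in basename and NEEDLE in zf.read(name):
                hits.append(name)
    return hits


if __name__ == "__main__":
    # Usage: python scan_wheels.py some.whl other.whl
    for path in sys.argv[1:]:
        for member in members_mentioning_libcrypt1(path):
            print(f"{path}: {member}")
```

A hit should be confirmed with a proper tool such as readelf before contacting a maintainer.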

zackw avatar Jul 19 '19 16:07 zackw

Something similar was suggested on pypa/warehouse#5420. The Warehouse maintainers might be able to say if it's practical to do something like that, and how to find all packages where the latest release has a manylinux wheel.

takluyver avatar Jul 19 '19 16:07 takluyver

I am about to release a new psycopg version. I assume the issue is not fixed, right?

dvarrazzo avatar Oct 20 '19 00:10 dvarrazzo

I see that all PRs linked from the "steps to be taken" comment are merged. If I understand this correctly, it means new wheels should bundle libcrypt.so.2.

Since this happened before the Python 3.8 release, it seems to me that no real-world wheels for Python 3.8 will need libcrypt.so.1. A distro will only need to provide it for Python 3.7 and below.

encukou avatar Oct 20 '19 10:10 encukou

As far as I can see, packages created yesterday with the latest version of manylinux don't include libcrypt:

$ unzip -l ../psycopg2_binary-2.8.4-cp38-cp38-manylinux1_x86_64.whl | grep crypt
  3217584  2019-10-20 00:57   psycopg2/.libs/libcrypto-3a9cf061.so.1.1.1d
   167808  2019-10-20 00:57   psycopg2/.libs/libk5crypto-622ef25b.so.3.1

Packages were created using the quay.io/pypa/manylinux1_x86_64 docker image.

Has the procedure to build packages changed?
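Since `grep crypt` also matches libcrypto and libk5crypto, a stricter filter makes the result easier to read. The sketch below assumes bundled libraries follow the `name-<hex hash>.so.<version>` pattern visible in the listing above; the helper name is hypothetical.

```python
import re

# Matches libcrypt.so.N and hash-renamed copies such as libcrypt-<hex>.so.N,
# but not libcrypto-*.so.* or libk5crypto-*.so.*.
_BUNDLED_LIBCRYPT = re.compile(r"(?:^|/)libcrypt(?:-[0-9a-f]+)?\.so(?:\.\w+)*$")


def bundled_libcrypt(member_names):
    """Return the wheel member names that look like a bundled libcrypt."""
    return [name for name in member_names if _BUNDLED_LIBCRYPT.search(name)]


# The two members from the listing above.
names = [
    "psycopg2/.libs/libcrypto-3a9cf061.so.1.1.1d",
    "psycopg2/.libs/libk5crypto-622ef25b.so.3.1",
]
print(bundled_libcrypt(names))  # prints []
```

For a real wheel, the names can be obtained with `zipfile.ZipFile(path).namelist()`.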

dvarrazzo avatar Oct 20 '19 10:10 dvarrazzo

@zackw @njsmith Do you think it is possible, given that we're building libcrypt.so.2 (from libxcrypt) for the manylinux* containers, that we also build libcrypt.so.1 (the compat version, again from libxcrypt) and use that specifically for bundling (hiding it otherwise because of the struct size differences) when the system libcrypt.so.1 is required? (@fweimer and I just chatted about this on IRC.) The libxcrypt version of libcrypt.so.1 has no GLIBC_PRIVATE dependencies and is a drop-in replacement for the system libcrypt.so.1. This would allow developers to continue to use parts of the OS that require libcrypt.so.1. That would solve some problems I've seen @dralley having (I talked with him about this impact today).

codonell avatar Feb 24 '20 21:02 codonell

@codonell Yes, I think that is a sane configuration. You'd have libcrypt.a and libcrypt.so matching libcrypt.so.2, but include libcrypt.so.1 on the file system for anything that might expect that via DT_NEEDED, and it shouldn't cause problems if auditwheel decides to bundle it.

As libxcrypt upstream I'm actually working on a restructure where we always build both, with all the real code in libcrypt.so.2 and libcrypt.so.1 being a thin wrapper with the compatibility symbols.

zackw avatar Feb 24 '20 21:02 zackw

@zackw,

I was looking into including libcrypt.so.1 from libxcrypt in addition to libcrypt.so.2 in the manylinux images (and development dependencies for the latter). It seems libcrypt.so.1 also exports all symbols from libcrypt.so.2 and, not knowing the internals of libxcrypt, I find it a bit dangerous that a wheel could actually embed both (libcrypt.so.1 through a system library dependency + libcrypt.so.2 as a direct dependency using the dev version) without knowing exactly which symbols would be bound at link time. Are there any recommendations while waiting for the "always build both" approach you mentioned?

I came up with some ugly patching of the libcrypt.map.in file in order to reduce the set of exposed functions to the ones actually exported by the system library, but I'm wondering if this is the way to go.

Ugly patch:

$ sed -r -i 's/XCRYPT_([0-9.])+/-/g;s/(%chain OW_CRYPT_1.0).*/\1/g' lib/libcrypt.map.in
$ ./autogen.sh
$ ./configure --disable-xcrypt-compat-files --enable-obsolete-api=glibc --enable-hashes=all --disable-werror
$ make DESTDIR=$(pwd)/so.1 install
$ nm -g -D --with-symbol-versions --defined-only ./so.1/usr/local/lib/libcrypt.so.1 | awk '{ print $3 }' | sort
crypt@GLIBC_2.2.5
crypt_r@GLIBC_2.2.5
encrypt@GLIBC_2.2.5
encrypt_r@GLIBC_2.2.5
fcrypt@GLIBC_2.2.5
GLIBC_2.2.5@@GLIBC_2.2.5
setkey@GLIBC_2.2.5
setkey_r@GLIBC_2.2.5
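To confirm that a patched build exposes only the system library's version node, the symbol list can be reduced to its set of versions. This is a small sketch that takes the `symbol@VERSION` lines printed above as input; the helper name is hypothetical.

```python
def symbol_versions(nm_symbols):
    """Return the set of version nodes in a list of `symbol@VERSION` lines."""
    versions = set()
    for line in nm_symbols.splitlines():
        line = line.strip()
        if not line:
            continue
        # "name@VER" is a non-default version, "name@@VER" the default one.
        _name, sep, version = line.partition("@")
        versions.add(version.lstrip("@") if sep else "")
    return versions


# The symbol list from the output above.
listing = """\
crypt@GLIBC_2.2.5
crypt_r@GLIBC_2.2.5
encrypt@GLIBC_2.2.5
encrypt_r@GLIBC_2.2.5
fcrypt@GLIBC_2.2.5
GLIBC_2.2.5@@GLIBC_2.2.5
setkey@GLIBC_2.2.5
setkey_r@GLIBC_2.2.5
"""

versions = symbol_versions(listing)
print(sorted(versions))
# Any XCRYPT_* node here would mean the patch did not hide the newer API.
print(any(v.startswith("XCRYPT_") for v in versions))
```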

mayeut avatar May 08 '20 16:05 mayeut

The newly merged PR should allow grafting libcrypt.so.1. I'm keeping this issue open as a reminder that libxcrypt will (soon?) provide both shared objects from a single build, which is ultimately what should be in the images.

mayeut avatar May 09 '20 15:05 mayeut