find_library broken in alpine image
According to http://bugs.alpinelinux.org/issues/5264, it's known that find_library is broken with python3 on Alpine.
However, with python2 it works on pure alpine:3.3,
$ docker run --rm -ti alpine:3.3 sh -c "apk add --no-cache file python && python -c 'from ctypes.util import find_library;print find_library(\"c\")'"
...
libc.musl-x86_64.so.1
but not on python:2-alpine,
$ docker run --rm -ti --entrypoint sh python:2-alpine -c "python -c 'from ctypes.util import find_library;print(find_library(\"c\"))'"
None
so python:2-alpine seems to break something in addition to the general Alpine issue, which leaves even Python 2 broken. Maybe the python image could also work around the find_library issue in general.
According to its documentation, find_library relies on ldconfig, gcc, or objdump. gcc and objdump are not included in the Docker image and need to be installed separately (apk add gcc), but even then find_library won't work for some libraries, such as libcairo. I've encountered this myself.
According to the author of this bug, ldconfig seems broken. The maintainer didn't say whether ldconfig is broken; instead, he provided a patch addressing the issue.
The Alpine image may need to apply the same patch to make find_library functional.
@yosifkit @tianon @ncopa
@ushuz ldconfig works just fine. It does not cache anything like GNU libc does, so there is nothing to print with ldconfig -p. As you understand, they (rightfully) don't trust parsing the ldconfig output, so they fall back to using gcc or objdump, because they likely thought there is no way any system exists without those tools installed.
The patch we use in Alpine is incomplete. The fundamental problem is that when you try to dlopen libc itself, bad things will happen. Upstream musl libc has said they will fix it, but I don't know what the status is there.
We should probably report this to upstream python.
@ncopa But why does calling ldconfig -p on alpine:latest output Illegal option -p?
Related Stack Overflow question: "Twisted server: monkey-patch file" (running Twisted on Python 2.7.x on Alpine Linux 3.7 inside Docker): https://stackoverflow.com/q/48234723/277267
I'm a little confused here -- what's the use case for find_library('c') in the first place? It seems strange to me to be trying to dlopen libc itself in this way -- aren't most of the functions it provides already available through the Python standard library in some other way?
But why calling ldconfig -p on alpine:latest outputs Illegal option -p?
As noted above, ldconfig in Alpine is much simpler than it has to be for glibc; here's the entire script contents of /sbin/ldconfig in Alpine:
$ docker run --rm alpine:3.7 cat /sbin/ldconfig
#!/bin/sh
scan_dirs() {
	scanelf -qS "$@" | while read SONAME FILE; do
		TARGET="${FILE##*/}"
		LINK="${FILE%/*}/$SONAME"
		case "$FILE" in
		/lib/*|/usr/lib/*|/usr/local/lib/*) ;;
		*) [ -h "$LINK" -o ! -e "$LINK" ] && ln -sf "$TARGET" "$LINK"
		esac
	done
	return 0
}
# eat ldconfig options
while getopts "nNvXvf:C:r:" opt; do
	:
done
shift $(( $OPTIND - 1 ))
[ $# -gt 0 ] && scan_dirs "$@"
@tianon
I'm a little confused here -- what's the use case for find_library('c') in the first place? It seems strange to me to be trying to dlopen libc itself in this way -- aren't most of the functions it provides already available through the Python standard library in some other way?
As mentioned in the SO link above, Twisted's inotify implementation uses find_library('c')
https://github.com/twisted/twisted/blob/6ac66416c0238f403a8dc1d42924fb3ba2a2a686/src/twisted/python/_inotify.py#L106-L110
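For callers that only need libc symbols (as in the inotify case), one possible workaround is sketched below. This is not Twisted's code; load_libc is a hypothetical helper name. It relies on the fact that on POSIX, ctypes.CDLL(None) calls dlopen(NULL) and exposes symbols already loaded into the process, which normally includes libc.

```python
import ctypes
import ctypes.util


def load_libc():
    """Return a CDLL handle that exposes libc symbols (POSIX only).

    Hypothetical helper, not part of Twisted or the standard library.
    """
    # May be None on musl-based images where find_library() cannot locate libc.
    name = ctypes.util.find_library("c")
    # CDLL(None) maps to dlopen(NULL): the handle covers the running process,
    # so libc functions already linked into the interpreter stay reachable.
    return ctypes.CDLL(name, use_errno=True)
```

Usage would look like libc = load_libc() followed by libc.getpid(); whether this is acceptable for a given project is a judgment call, since it sidesteps find_library rather than fixing it.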
As noted above, ldconfig in Alpine is much simpler than it has to be for glibc
CPython's stdlib ctypes.util.find_library() depends on ldconfig -p, or gcc/cc with objdump, or ld with objdump to work properly. But a working ldconfig -p, gcc, cc, ld and objdump are all missing from python:alpine images.
https://github.com/python/cpython/blob/4acc140f8d2c905197362d0ffec545a412ab32a7/Lib/ctypes/util.py#L255-L312
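To check which of those helper programs are present in a given image, something like the following sketch can be run inside the container. The names TOOLS, EXTRA_DIRS and available_tools are made up for illustration. Note that a tool being present does not mean it behaves as find_library expects (Alpine's ldconfig exists but does not support -p).

```python
import shutil

# Helper programs that ctypes.util.find_library() may shell out to on Linux.
TOOLS = ("ldconfig", "gcc", "cc", "ld", "objdump")

# ldconfig usually lives in /sbin, which is not always on PATH,
# so these directories are checked as a fallback (an assumption of this sketch).
EXTRA_DIRS = "/sbin:/usr/sbin"


def available_tools(tools=TOOLS):
    """Map each tool name to its resolved path, or None if it was not found."""
    return {
        tool: shutil.which(tool) or shutil.which(tool, path=EXTRA_DIRS)
        for tool in tools
    }


if __name__ == "__main__":
    for tool, path in available_tools().items():
        print(f"{tool:9}{path or 'MISSING'}")
```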
So I tried installing gcc, but the result seems odd.
$ docker run --rm -it python:alpine apk add --no-cache gcc && python -c 'from ctypes.util import find_library;print(find_library("c"))'
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/main/x86_64/APKINDEX.tar.gz
...
OK: 117 MiB in 55 packages
/usr/lib/libc.dylib
$ docker run --rm -it python:alpine sh -c "apk add --no-cache gcc; python -c 'from ctypes.util import find_library;print(find_library(\"c\"))'"
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/main/x86_64/APKINDEX.tar.gz
...
OK: 117 MiB in 55 packages
None
The Alpine maintainers apply the following patch: https://github.com/alpinelinux/aports/blob/master/main/python2/musl-find_library.patch
Can you put the same patch in the image?
We should probably report this to upstream python.
Did anyone end up doing so? (not seeing a link anywhere in the comments here)
They definitely won't fix it if they don't know it's an issue. :sweat_smile:
It looks like https://bugs.python.org/issue21622 is probably the right place.
To clarify, this is definitely an issue, and definitely something that ought to be fixed, but I'm not keen on the implementation in https://github.com/alpinelinux/aports/blob/202f4bea916b0cf974b38ced96ab8fca0b192e3f/main/python2/musl-find_library.patch especially given that it's very Alpine-specific and doesn't appear to have had any review by the Python maintainers (which I'd want for something that patches the standard library).
I think what's needed here is a patch that Python upstream would be willing to consider and/or merge themselves, and I haven't seen anyone propose anything like that (and there's been no response to my ping on the upstream bug several months ago).
I've submitted some PRs for cpython master, which should backport to 3.7 with minimal fuss considering there are no conflicts with the util.py file. https://github.com/python/cpython/pull/10453 Also for the 2.7 branch https://github.com/python/cpython/pull/10455
In the 3.7 branch, they have the ldconfig, gcc and objdump methods that reference LD_LIBRARY_PATH to get names. I reused the implementation above to walk that path and removed the Alpine-specific checks.
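For readers who want a rough picture of the idea, here is a minimal sketch of walking LD_LIBRARY_PATH to locate a library by name. It is not the code from the PRs; find_in_ld_library_path is a hypothetical name, and unlike find_library it returns a full path rather than a soname.

```python
import os


def find_in_ld_library_path(name, env=None):
    """Return the first lib<name>.so* file found on LD_LIBRARY_PATH, else None.

    Illustrative only: the real PRs may differ in matching and ordering.
    """
    env = os.environ if env is None else env
    prefix = f"lib{name}.so"
    for directory in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        # Skip empty entries and directories that do not exist.
        if not directory or not os.path.isdir(directory):
            continue
        for entry in sorted(os.listdir(directory)):
            # Match "libfoo.so" and versioned names such as "libfoo.so.1",
            # but not unrelated names such as "libfoobar.so".
            if entry == prefix or entry.startswith(prefix + "."):
                return os.path.join(directory, entry)
    return None
```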
My opinion is that deciding which library name to look for should be the job of the library or app, based on detecting the environment it runs in, or of the user, who can configure or override it where applicable. Once you know which library name you want, then in the POSIX case LD_LIBRARY_PATH determines where to search, which is within the user's control, and searching it is within the responsibility of the find_library function.
At least in my case, I can use the alpine:3.8 image, set ENV LD_LIBRARY_PATH in my Dockerfile, and make sure I apk add the run-time deps for anything my Python apps may need to find using find_library. Logic such as asking for libc instead of libm, libcrypt, libpthread, etc., I patch into my apps/libraries, since they are the ones that need to know which lib name to search for; then they can rely on Python to find it. That means some duplicated code for cases like that, yes, but find_library then actually does what it sets out to do: it finds a library by the name I give it.
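As a sketch of what that app-side logic could look like, the caller supplies an ordered list of candidate names and leaves the lookup to find_library. first_available and its finder parameter are hypothetical, introduced only for this illustration.

```python
from ctypes.util import find_library


def first_available(candidates, finder=find_library):
    """Return the first non-empty result of finder(name), or None.

    The app decides which names to try and in what order; the finder
    (find_library by default) only resolves a single given name.
    """
    for name in candidates:
        found = finder(name)
        if found:
            return found
    return None
```

An app that knows it is running on musl might call first_available(["c"]) where it would otherwise ask for "m" or "pthread"; the candidate list itself stays the app's responsibility.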
Closing in favor of the upstream tracking issue (https://bugs.python.org/issue21622), since this is really an upstream bug. :sweat_smile: :heart:
In the meantime why can't a patch be made for Alpine specifically? E.g. something like below?
https://github.com/home-assistant/docker-base/blob/ddc5dbcce5de4c91d46f34f4f4e2d3ff57228bcb/python/3.13/musl-find_library.patch
It would help a lot since people wouldn't need to build Python themselves or write a script to symlink libraries to a .so$
Last time I looked at / considered https://github.com/python/cpython/pull/18380, it appeared to be more fundamentally controversial than it is now, so this is indeed probably something we should seriously consider applying and keeping up with (especially given that per PEP 11, Alpine/musl is not a supported platform, so it seems reasonable for us to consider carrying this patch unless/until that's resolved).
@tianon that would be fantastic and thank you for reopening, it's one of those gotchas that people run into and spend hours debugging.
@tianon would you be open to PRs to add this functionality across all the Python versions? I may have some free time to help out.
Yes, but the biggest challenge (that I don't have a simple answer to atm) is going to be finding a clean way to maintain the patch, especially with the necessary changes for different Python versions. Ideally we'd just download it directly from a URL like https://github.com/python/cpython/pull/18380.diff?full_index=1 and validate it with a sha256 (currently bedc83b96b4888bf6540178719ab120a69e61f061dd54c2b00e3a6918f534bf3), accepting that we'll need to manually update that any time the PR has changes, but my own testing shows that's not actually tenable (patch currently doesn't apply as-is on at least Python 3.12, and that URL appears to be subject to more strict rate limits than other GitHub URLs).
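The "download and validate with a sha256" step can be sketched as follows. This is only an illustration in Python; the images themselves would more likely do this in a Dockerfile, and verify_sha256 and fetch_and_verify are hypothetical names.

```python
import hashlib
import hmac
import urllib.request


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of data matches expected_hex."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected_hex.lower())


def fetch_and_verify(url: str, expected_hex: str) -> bytes:
    """Download url and return its bytes, raising ValueError on a checksum mismatch."""
    with urllib.request.urlopen(url, timeout=30) as response:
        data = response.read()
    if not verify_sha256(data, expected_hex):
        raise ValueError(f"sha256 mismatch for {url}")
    return data
```

A mismatch then fails loudly whenever the PR diff changes, which is the manual-update cost mentioned above.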
Something like a hacky "sed the right source files" type construction is out of the question, and I'd really like to avoid maintaining patch files in this repository directly, so this needs some more thought on approaches for getting/maintaining patches across all supported Python versions before we can commit to it fully.