libreadline.so.6 error on Suse12

Open arossert opened this issue 5 years ago • 45 comments

I'm using PyInstaller to pack my application for distribution on Linux servers. To support a large number of Linux flavors and versions, I'm using a Docker image with an old glibc version to build and pack the application.

Everything works on Ubuntu/Debian/RHEL/CentOS and Suse11, but when I tried to run the application on a Suse12 server I got this error:

/bin/sh: /dist/app/libreadline.so.6: no version information available (required by /bin/sh)
/bin/sh: relocation error: /bin/sh: symbol rl_filename_stat_hook, version READLINE_6.3 not defined in file libreadline.so.6 with link time reference

What is this library for?

The libreadline version on the build machine is 6.1-1:

(test) root@aed98870a11d:/# dpkg -l | grep libreadline
ii  libreadline-dev                 6.1-1                          GNU readline and history libraries, developm
ii  libreadline6                    6.1-1                          GNU readline and history libraries, run-time
ii  libreadline6-dev                6.1-1                          GNU readline and history libraries, developm

It looks like some of the code is running (I can see some output), but I'm not sure if all aspects are working. I'm still investigating this and will update if I get any new information.

Update: Building a simple "Hello World" executable works without this error, so it is probably one of my imports that causes this issue.

Would upgrading readline before building Python help?

PyInstaller: 3.6 OS: Linux (Suse12) Python: 3.7.5/2.7.16

arossert avatar Feb 01 '20 19:02 arossert

I kind of found a solution for now, but I'm not sure how it will impact the running code. After removing the libreadline.so.6 file from the bundled directory, the error message disappears.

Is this a good thing to do? How can this impact my application?

arossert avatar Feb 04 '20 10:02 arossert

This may or may not affect your application; it depends on whether your app uses it. @htgoebel could you please respond? (Hartmut is the Linux guru - I use Windows mostly)

Legorooj avatar Feb 29 '20 01:02 Legorooj

@Legorooj thanks for the reply. I still can't figure it out. I'm building on a machine with readline 6.1, and this is the version that is copied to the dist dir, so why is it looking for version 6.3 on another machine?

arossert avatar Mar 01 '20 22:03 arossert

Try the Development branch. It's probably something to do with the bootloader or a DLL/SO etc in the library that uses it.

Legorooj avatar Mar 02 '20 01:03 Legorooj

Thanks @Legorooj, I will try the develop branch and update.

arossert avatar Mar 02 '20 08:03 arossert

@Legorooj I have tried the latest develop version (4.0.dev0+a1f92c6a08) and got the same results. On the target machine, the pre-installed version is libreadline6-6.3, and on the build machine it is libreadline6-6.1. If I replace the libreadline.so.6 file on the target machine with the one that comes preinstalled in the bundle directory, the error message disappears.

@htgoebel maybe you can help me find out what is going on. I can't find what is using this lib and why it is looking for a newer version if one is already bundled in my app dir.

I want my application to work on as many Linux distros as possible without maintaining multiple builds.

arossert avatar Mar 02 '20 15:03 arossert

@arossert yes you'll need a Linux person - I don't understand Unix executables much. It's probably something to do with the PyInstaller bootloader or the library.

Legorooj avatar Mar 02 '20 22:03 Legorooj

@Legorooj thanks for trying to help. I'm still trying to figure it out and will wait for @htgoebel input.

arossert avatar Mar 03 '20 10:03 arossert

@arossert pointers: --debug=all will debug when running - imports, loading, unpacking etc. Look at build/warn-**.txt. And redirect the output of the command with the --log-level=TRACE to a file - because there'll be 100,000+ lines.

Legorooj avatar Mar 03 '20 10:03 Legorooj

@arossert you know a bit more Linux than I do, and I just remembered this. See section one - it might help.

Legorooj avatar Mar 09 '20 06:03 Legorooj

@Legorooj Thanks for trying to help with this issue.

I'm already aware of the GLIBC limitation that PyInstaller has; this is why I'm building my application in a Docker container with ubuntu10 as the base, to have an old GLIBC version.

I have an idea about the root cause of this error. I'm building Python from source, and the build machine has libreadline6.1, so Python is linked against that version. On the target machine, /bin/sh is linked against libreadline6.3, and this is where the error is coming from: probably the bootloader is executing sh, which tries to link with the libreadline6.1 that sits in the app directory. If I remove the file, the error disappears (maybe it finds the libreadline6.3 that exists on the machine, but I can't confirm that).

What I'm trying to understand is whether I can just ignore this error or should remove this .so file from the final app directory.

arossert avatar Mar 09 '20 12:03 arossert

@arossert in the section I linked to there's a tool called staticX. Maybe that could work?

Legorooj avatar Mar 09 '20 23:03 Legorooj

@Legorooj Thanks, I struggled before with staticx and couldn't make it work, I gave it another shot today and realized that it only works if PyInstaller is used in one file mode and not in dir mode.

I did try it with one file (I want to use one dir at the end but tried it anyway) and when executing the app on an old machine I get this error:

arossert:~> ./app
FATAL: kernel too old

Probably I need to build on an older kernel and see if it is supported, the issue is getting staticx to work on an old machine...

I will continue to try and find a solution, hopefully, @htgoebel will give his input on this.

arossert avatar Mar 10 '20 09:03 arossert

@arossert when @htgoebel is active he'll be able to help. Nothing more I can do on this specific issue; I'm a windows guy. (For desktop - Linux runs servers.)

Legorooj avatar Mar 10 '20 10:03 Legorooj

I was able to pinpoint the module that causes this error. It is keyring:

import keyring

Building a program with only this import reproduces the error. I'm using version 18.0.1, which is the latest that still supports Python 2.7.

After doing some more digging using strace I have noticed that if the .so is not there it will use the system one instead.

This is the output when the file exists:

open("/home/app/tls/x86_64/libreadline.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/home/app/tls/libreadline.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/home/app/x86_64/libreadline.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/home/app/libreadline.so.6", O_RDONLY|O_CLOEXEC) = 3

This is when it is not:

open("/home/app/tls/x86_64/libreadline.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/home/app/tls/libreadline.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/home/app/x86_64/libreadline.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/home/app/libreadline.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib64/bash/4.3/tls/x86_64/libreadline.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib64/bash/4.3/tls/libreadline.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib64/bash/4.3/x86_64/libreadline.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib64/bash/4.3/libreadline.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib64/libreadline.so.6", O_RDONLY|O_CLOEXEC) = 3

So I wonder if this file really needs to be packed in the final bundle if this library should already exist on the system.

Maybe it should look first in the system path to find this lib and, if it does not exist there, look in the app path (not sure if this is even possible to do for a single lib).

arossert avatar Mar 10 '20 14:03 arossert

That's useful to know. Assuming you're not using keyring (directly/indirectly) see if --exclude-module keyring works.

Legorooj avatar Mar 10 '20 22:03 Legorooj

@Legorooj I'm using keyring in my app, so I cannot exclude it. I have done some more investigation around this: I upgraded libreadline.so to version 8, so the file that is packed in my app is now libreadline.so.8. When executing the app I no longer get this error, and looking at the strace output it seems that it loads the system libreadline.so.6. On other Linux systems like Ubuntu this lib is not loaded at all.

This leads me to the conclusion that this lib is not needed in the bundle; if it is needed, it probably already exists on the system. I might be wrong though. If this is correct, I will be happy to submit a PR to exclude this lib, just like PyInstaller skips libc.so and similar.
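For reference, the exclusion idea can be tried locally in a spec file without patching PyInstaller. The following is a minimal sketch, assuming the entries of Analysis.binaries behave like (dest_name, src_name, typecode) tuples; drop_binaries is a hypothetical helper name and not part of PyInstaller.

```python
import os


def drop_binaries(toc, prefixes):
    """Return the entries whose destination file name does not start
    with any of the given prefixes (hypothetical helper)."""
    prefixes = tuple(prefixes)
    return [
        entry for entry in toc
        if not os.path.basename(entry[0]).startswith(prefixes)
    ]


# In a .spec file this could be applied after Analysis(), for example:
#   a.binaries = drop_binaries(a.binaries, ["libreadline.so"])
```

Whether dropping the library is safe still depends on the target system shipping a compatible libreadline, as discussed above.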

@htgoebel your input will be much appreciated.

arossert avatar Mar 11 '20 12:03 arossert

Many thanks for your detailed analysis - which is much better than most other bug reports.

Now, this is a tricky one :-)

  1. The bootloader is not executing sh, thus this can not be the conflict. I assume the error message is a bit confusing - or keyring is executing an external program.
  2. https://github.com/pyinstaller/pyinstaller/issues/4657#issuecomment-597126547 shows that the readline bundled with your app is picked up - which is what we want.
  3. You are writing about readline6-6.1, readline6-6.3 and "libreadline.so version 8". What are the respective .so files for each of these versions? Esp.: Is readline6-6.3 providing ….so.8?
  4. Are you using one-dir mode or one-file mode? Please try one-dir-mode.
  5. Please (for completeness) try starting your app from another working directory.
  6. Please try whether this error occurs in a simple program when using import readline. If this shows the same issues, we can focus on readline. Otherwise we need to search in keyring.
  7. Please run ldd on all files in the bundle directory and watch for conflicts or other striking points.
  8. If readline alone does not show the issue, please run again under strace control and watch for exec*, spawn*, system and the like.
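The ldd check from step 7 can be scripted. This is a rough sketch, assuming the usual "name => path (address)" line format of ldd output; parse_ldd and scan_bundle are hypothetical helper names introduced only for illustration.

```python
import subprocess
from pathlib import Path


def parse_ldd(output):
    """Map each library name in ldd output to its resolved path
    (None when ldd reports "not found")."""
    resolved = {}
    for line in output.splitlines():
        if " => " not in line:
            continue  # e.g. linux-vdso or the dynamic linker itself
        name, _, target = line.strip().partition(" => ")
        target = target.strip()
        if target.startswith("("):
            continue  # only a load address, no file on disk
        if target.startswith("not found"):
            resolved[name] = None
        else:
            # Drop the trailing load address, e.g. "(0x00007f...)".
            resolved[name] = target.split(" (")[0]
    return resolved


def scan_bundle(bundle_dir):
    """Run ldd on every *.so* file in bundle_dir and parse the result."""
    report = {}
    for path in sorted(Path(bundle_dir).glob("*.so*")):
        proc = subprocess.run(["ldd", str(path)], capture_output=True, text=True)
        report[path.name] = parse_ldd(proc.stdout)
    return report
```

Comparing the resolved paths against the bundle directory shows which libraries come from the bundle and which from the system.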

htgoebel avatar Mar 14 '20 11:03 htgoebel

@htgoebel thanks a lot for your response.

I have investigated it a bit more and have a full strace log of what is happening and will attach the log here.

What I can see is that something is running sh -c uname -p 2> /dev/null (file 6657):

execve("/bin/sh", ["sh", "-c", "uname -p 2> /dev/null"], [/* 50 vars */]) = 0

This process is created by the main process (file 6654) and I'm not sure why:

clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f8a4ebbe9d0) = 6657

I'm not sure that keyring is responsible for this now, it looks like something that looks for machine information.

Regarding the libreadline version: after installing version 8, PyInstaller packs libreadline.so.8 in the bundle. The sh that is executed is linked to libreadline.so.6, so it ignores the bundled file and finds the one that is on the system.

Maybe you can take a look at the trace files and get more information. Let me know if I can help in any way.

trace.6653.txt trace.6654.txt trace.6655.txt trace.6656.txt trace.6657.txt trace.6658.txt

arossert avatar Mar 14 '20 15:03 arossert

@htgoebel I was finally able to locate the root cause of this error. It is not coming from keyring per se: keyring uses platform.system(), and after looking at the source code of platform I found this.

It seems that platform.uname() uses the uname -p command to get the processor information. It runs the command through sh -c, and in my case sh is linked against another libreadline.so version, which causes this error.

This happens in both Python 2 and 3, even though _syscmd_uname was changed a bit.
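The mechanism can be illustrated outside PyInstaller. A small sketch, assuming a POSIX /bin/sh is available: any shell started from a process whose environment contains LD_LIBRARY_PATH inherits that value, so the dynamic loader searches the listed directory first when it loads sh. The function name ld_path_seen_by_shell is made up for this example.

```python
import os
import subprocess


def ld_path_seen_by_shell(value):
    """Start `sh -c` with LD_LIBRARY_PATH set to `value` and return the
    value that the child shell itself reports."""
    env = dict(os.environ, LD_LIBRARY_PATH=value)
    result = subprocess.run(
        ["/bin/sh", "-c", 'printf %s "$LD_LIBRARY_PATH"'],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout
```

In a frozen app the variable points at the bundle directory, which is how the bundled libreadline.so.6 ends up in front of the system one for /bin/sh.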

I'm not sure if this can be worked around. Currently I have solved it by updating libreadline.so.6 to libreadline.so.8, but it might cause errors in the future.

If you have any idea how we can get around this I'm willing to spend some time on it and submit a PR.

arossert avatar Mar 14 '20 20:03 arossert

I'm seeing the same issue building on Debian 8 and deploying to SUSE 12.2, but the underlying problem isn't tied to these two or the particular library.

libreadline is used by bash, often used as /bin/sh. Anything that invokes libc system() or popen() will trigger it. The PyInstaller documentation suggests changing LD_LIBRARY_PATH prior to invoking external programs. That's not a practical solution unless it is done prior to any import statements or 3rd party code. My example program is:

#!/usr/bin/env python
import uuid

Running Python 2.7.9, this eventually leads to:

8249  execve("/bin/sh", ["sh", "-c", "/sbin/ldconfig -p 2>/dev/null"], [/* 47 vars */] <unfinished ...>

While the error message appears to be harmless, substituting arbitrary libraries via an inherited environment variable can have more significant consequences. https://gms.tf/ld_library_path-considered-harmful.html has some discussion of the topic, as well as suggestions for fixing it.
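For code that does control its own subprocess calls, the documented workaround looks roughly like the sketch below. It assumes the bootloader saved the original value in LD_LIBRARY_PATH_ORIG, as the PyInstaller documentation describes; system_env is a hypothetical helper name. It does not help with calls made inside the standard library or third-party code, which is the limitation described above.

```python
import os


def system_env(environ=None):
    """Return a copy of the environment suitable for launching system
    programs: LD_LIBRARY_PATH is restored from LD_LIBRARY_PATH_ORIG if
    that variable is present, and removed otherwise."""
    env = dict(os.environ if environ is None else environ)
    original = env.get("LD_LIBRARY_PATH_ORIG")
    if original is not None:
        env["LD_LIBRARY_PATH"] = original
    else:
        env.pop("LD_LIBRARY_PATH", None)
    return env


# Usage with an explicit subprocess call:
#   subprocess.run(["uname", "-p"], env=system_env())
```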

There's a proof of concept change at https://github.com/jeremykatz/pyinstaller/commit/36e593ee7ca9f1aa9e30dd9e7f56a8eea8bdded8 This needs some cleanup to work with 32 bit systems, as well as any other variability in the name of the dynamic linker.

Another option is to set the rpath on the executable. The difficulty here is the randomized extraction location. Setting rpath to . and changing the working directory to the extraction location works, but opens the application to code injection and linkage failure if it ever changes directories. Using $ORIGIN is possible if the executable is copied into the extraction directory. A symlink wasn't enough in my testing.

jeremykatz avatar Apr 03 '20 22:04 jeremykatz

@jeremykatz Thanks for the reply. This issue with sh being linked to libreadline is still an issue for me too, and changing LD_LIBRARY_PATH before all imports is not a real solution.

I'm willing to work on a solution and submit a PR if you have any idea on how this can be solved.

arossert avatar Apr 04 '20 09:04 arossert

https://github.com/jeremykatz/pyinstaller/commit/288f34cae0997e04087b43992de8c33b70b58562 is my current WIP solution. It avoids the issue with LD_LIBRARY_PATH and child processes, but has a major downside.

/proc/self/exe and /proc/self/cmdline point to the dynamic linker and its arguments. Instead of $ ps | grep myprog listing myprog, it will say something like /lib64/ld-2.29.so --library-path /tmp/_MEIMT9Vji /path/to/myprog

https://linux.die.net/man/7/rtld-audit might work. The workflow I'm imagining has the extractor invocation of the program set LD_AUDIT for its child. The child could then clear the variable prior to handing control over to user code.

jeremykatz avatar Apr 08 '20 21:04 jeremykatz

After a little reading into the GNU dynamic linker, it appears that LD_LIBRARY_PATH is only read during program initialization. Unsetting or changing it after does not impact dlopen within the process. Given a little testing, jeremykatz@6bb0029 appears to be a viable solution. It's also less code than the rtld-audit solution.

I don't know if this works on any of the other systems that use LD_LIBRARY_PATH. The same goes for DYLD_LIBRARY_PATH on macOS, and LIBPATH on AIX.

jeremykatz avatar Apr 28 '20 22:04 jeremykatz

@rokm Would https://github.com/pyinstaller/pyinstaller/issues/4657#issuecomment-611216268 work? I'd have thought that it would have exactly the same problem as just unsetting/restoring os.environ["LD_LIBRARY_PATH"] - things like Gtk launching their internal subprocesses would break?

bwoodsend avatar Jun 14 '22 20:06 bwoodsend

I imagine it would be the same as resetting/restoring LD_LIBRARY_PATH. Which, by the way, can be performed in the Python code (at the start of the program) if the developer is 100% sure that no other executables were collected from the build system that would require the collected (version of) shared libraries instead of the system ones.

Ultimately, it's impossible to cater to launching both collected executables and the system ones on Linux (unless the build system and the target system use the same version of the same distribution).

rokm avatar Jun 14 '22 21:06 rokm

> I imagine it would be the same as resetting/restoring LD_LIBRARY_PATH. Which, by the way, can be performed in the python code (at the start of the program) if developer is 100% sure that no other executables were collected from the build system that would require the collected (version of) shared libraries instead of system ones.

Collected executables should set rpath to use $ORIGIN.

jeremykatz avatar Jun 14 '22 21:06 jeremykatz

> > I imagine it would be the same as resetting/restoring LD_LIBRARY_PATH. Which, by the way, can be performed in the python code (at the start of the program) if developer is 100% sure that no other executables were collected from the build system that would require the collected (version of) shared libraries instead of system ones.

> Collected executables should set rpath to use $ORIGIN.

How does that work for onefile builds? I don't think modifying rpath at runtime is feasible...

rokm avatar Jun 14 '22 21:06 rokm

> > > I imagine it would be the same as resetting/restoring LD_LIBRARY_PATH. Which, by the way, can be performed in the python code (at the start of the program) if developer is 100% sure that no other executables were collected from the build system that would require the collected (version of) shared libraries instead of system ones.

> > Collected executables should set rpath to use $ORIGIN.

> How does that work for onefile builds? I don't think modifying rpath at runtime is feasible...

Collected executables are extracted to the temporary directory along with collected libraries, I presume.

$ORIGIN is wherever the executable is run from, i.e. the temporary extraction directory. Set rpath at the time the bundle is created with a tool such as chrpath.
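As a rough sketch of that build-time step, assuming chrpath is installed and accepts "-r <path> <file>" to replace an existing rpath; the helper names below are invented for illustration.

```python
import subprocess


def chrpath_command(executable, rpath="$ORIGIN"):
    """Build the argv for replacing an executable's rpath with chrpath."""
    return ["chrpath", "-r", rpath, str(executable)]


def set_origin_rpath(executable):
    # Passing a list (no shell) keeps "$ORIGIN" literal, which is the form
    # the dynamic linker expands at load time.
    subprocess.run(chrpath_command(executable), check=True)
```

chrpath can only rewrite an rpath that already exists in the file and cannot make it longer, so a tool such as patchelf may be needed for executables built without one.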

jeremykatz avatar Jun 14 '22 21:06 jeremykatz

Ah, right, $ORIGIN refers to collected executables' location, not the main executable. Thanks, that does sound like a good approach.

rokm avatar Jun 14 '22 22:06 rokm