
Imports failing with "cannot locate symbol" error for wheels built on Ubuntu, deployed on aarch64 Android

Open sbroberg opened this issue 5 years ago • 2 comments

  • cross-env version: 0.7

Relevant code or config:

Using the attached cross-py.zip file:

  1. Unzip on ubuntu 18.04
  2. cd cross-py
  3. ./build.sh

This will download an android-ndk, openssl & libffi, build the two libs & then download & build a Linux x64 Python 3.8.3 and an aarch64 Python 3.8.3. It then installs crossenv & creates the venv and uses it to build a numpy wheel for aarch64. All are built for Android API level 29.

cross-py.zip

The result is an install-tar.tgz file, which should be copied to a target platform. I've used both a Qualcomm board running Android 10 and a Pixel 4 running Android 10 (both API level 29). I've tried connecting to both via adb and a direct connection, but have had better luck (better tool support) using Termux.

Once on the android box, do:

tar -xf install-arm.tgz 
cd install-arm
source runtime-env.sh # will set paths & install pip
bin/pip3 install ../numpy-1.18.4-cp38-cp38-linux_aarch64.whl 
bin/python3 -m numpy

What happened:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/data/com.termux/files/home/install-arm/lib/python3.8/site-packages/numpy/__init__.py", line 142, in <module>
    from . import core
  File "/data/data/com.termux/files/home/install-arm/lib/python3.8/site-packages/numpy/core/__init__.py", line 100, in <module>
    from . import _add_newdocs
  File "/data/data/com.termux/files/home/install-arm/lib/python3.8/site-packages/numpy/core/_add_newdocs.py", line 4441, in <module>
    add_newdoc('numpy.core._multiarray_tests', 'format_float_OSprintf_g',
  File "/data/data/com.termux/files/home/install-arm/lib/python3.8/site-packages/numpy/core/function_base.py", line 506, in add_newdoc
    new = getattr(__import__(place, globals(), {}, [obj]), obj)
ImportError: dlopen failed: cannot locate symbol "tanh" referenced by "/data/data/com.termux/files/home/install-arm/lib/python3.8/site-packages/numpy/core/_multiarray_tests.cpython-38.so"...

Reproduction repository:

See attached zip

Problem description:

Note that the cross-compiled Python executable itself works fine; pure-Python packages can be installed and imported without issue (although occasionally some dependencies need to be installed). The failing symbol above, tanh, lives in libm.so, which is on my library path and is loaded by Python immediately on startup (strace confirms this).

Since working with Termux is an option, the fact that I can't build numpy is not a problem per se: I can just "pip install numpy" directly, and the version installed that way works correctly.

However, what I'm really trying to get working is opencv, which cannot be installed from the pip repositories. After much effort, I've finally coaxed the opencv/Python build system into producing an aarch64 wheel, but unfortunately I'm back to the same error (although in this case it complains about a function in liblog.so instead of libm.so). The generated .so appears otherwise correct; something about its linkage prevents the Python import machinery from loading it.

The opencv build is far more complicated and takes much longer than numpy's; if we can come up with a solution that fixes numpy's problem, I can apply it to the opencv build.

Suggested solution:

I'm at my wit's end. As far as I can tell, I'm doing everything "right" but somehow the wheels I'm generating are subtly different from the ones that are in the aarch64 pip repositories.

sbroberg avatar Jun 03 '20 12:06 sbroberg

I was able to reproduce this, and I might have a clue. It appears that _multiarray_tests.cpython-38.so uses a function in libm.so, but does not actually declare a dependency on that library. (Verify with readelf -d <path/to/file.so> and look for DT_NEEDED entries.) If we run the last line as:

$ LD_PRELOAD=/system/lib64/libm.so bin/python3 -m numpy

Then I get the error:

ModuleNotFoundError: No module named '_ctypes'

which is at least a different error, and reasonable if your Python build didn't include ctypes' dependencies. (I didn't look closely at your setup.)

In the worst case, setting LD_PRELOAD might be a workable hack, but I think you can specify the extra libraries that numpy needs with a site.cfg. See here for a reference. I have an example that I've used to cross-compile numpy and scipy in the past that might also help.
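A variation on the preload idea that avoids wrapping the interpreter invocation is to pull libm into the global symbol namespace from inside Python before the failing import. A minimal sketch (the Android fallback path is an assumption; adjust for the device):

```python
import ctypes
import ctypes.util

# Resolve libm: find_library works on glibc systems; the fallback is
# where Android typically keeps it (assumption, adjust per device).
libm_path = ctypes.util.find_library("m") or "/system/lib64/libm.so"

# RTLD_GLOBAL publishes libm's symbols to shared objects loaded later,
# so an extension missing its DT_NEEDED entry on libm can still
# resolve symbols such as "tanh" at import time.
libm = ctypes.CDLL(libm_path, mode=ctypes.RTLD_GLOBAL)

# After this, the failing import should get past the symbol lookup:
# import numpy
```

This only masks the missing DT_NEEDED entry at runtime; the real fix is still to get -lm onto the link line when the wheel is built.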

Unfortunately, those fixes are very specific to numpy (I've complained about that before, too), so I don't know if site.cfg stuff will be much help for your ultimate goal of getting opencv to work. In both cases, it does sound like a link flag needs to be injected into the build somewhere, -lm for numpy, and -llog for opencv.
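For the numpy side, a site.cfg along these lines might inject the missing library at build time. This is a sketch based on numpy's site.cfg.example; I haven't verified that these exact keys fix the Android build, and the library_dirs path is a placeholder:

```ini
; site.cfg placed next to numpy's setup.py before building the wheel.
; [ALL] applies to every section; "libraries" should add -lm to the
; link line.
[ALL]
libraries = m
library_dirs = /path/to/ndk/sysroot/usr/lib
```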

Hopefully this can help you make some progress.

benfogle avatar Jun 03 '20 19:06 benfogle

Verify with readelf -d <path/to/file.so> and look for DT_NEEDED entries.

Hi, a wonderful tool for debugging (and also temporarily fixing) these annoyances is https://github.com/NixOS/patchelf

pmp-p avatar Mar 06 '21 07:03 pmp-p