manylinux

EXT_SUFFIX — ppc64le vs powerpc64le

Open frenzymadness opened this issue 3 years ago • 9 comments

Hello.

I am trying to build a very simple project based on python-manylinux-demo for all available architectures and I've discovered a problem on ppc64le.

The problem is that in a wheel produced by the latest manylinux container image (for example simple_manylinux_demo-1.0-cp38-cp38-manylinux2014_ppc64le.whl), I have a compiled extension named extension.cpython-38-powerpc64le-linux-gnu.so, when I believe the correct name is extension.cpython-38-ppc64le-linux-gnu.so.

We have discovered the problem on a system where sysconfig.get_config_var("EXT_SUFFIX") is '.cpython-38-ppc64le-linux-gnu.so' and therefore Python cannot find the extension with a different architecture name in the file name.

All Pythons in the container image are configured with powerpc64le:

# for PYBIN in /opt/python/*/bin/; do "${PYBIN}/python" -c 'import sysconfig; print(sysconfig.get_config_var("EXT_SUFFIX"))'; done
.cpython-35m-powerpc64le-linux-gnu.so
.cpython-36m-powerpc64le-linux-gnu.so
.cpython-37m-powerpc64le-linux-gnu.so
.cpython-38-powerpc64le-linux-gnu.so
.cpython-39-powerpc64le-linux-gnu.so

I think that they should be configured with '.cpython-XY-ppc64le-linux-gnu.so'. What do you think about it?

frenzymadness avatar Jul 29 '20 08:07 frenzymadness

This is baked-in to the _sysconfigdata*.py file created from the Makefile generated by ./configure. This is made up from the SOABI, and the part in question is the PLATFORM_TRIPLET. This in turn is hard-coded as powerpc64le-linux-gnu. So the question becomes "how did you get ppc64le and not powerpc64le for your python?" It seems someone messed with the source code?
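To make the composition described above concrete, here is a minimal sketch that splits an EXT_SUFFIX value into its SOABI and platform triplet. The helper name `split_ext_suffix` is made up for illustration, and it assumes the Linux CPython layout seen in this thread (`.<implementation>-<version>-<triplet>.so`); other platforms use different layouts.

```python
import sysconfig


def split_ext_suffix(ext_suffix):
    """Split e.g. '.cpython-38-powerpc64le-linux-gnu.so' into (soabi, triplet, ext).

    Hypothetical helper; assumes the '<impl>-<version>-<triplet>' layout
    used by CPython on Linux.
    """
    # Drop the leading dot and separate the file extension.
    stem, _, ext = ext_suffix.lstrip(".").rpartition(".")
    # The first two dash-separated fields are the implementation and version
    # tags (e.g. 'cpython', '38'); whatever follows is the platform triplet.
    fields = stem.split("-")
    triplet = "-".join(fields[2:])
    return stem, triplet, ext


if __name__ == "__main__":
    # Values for the running interpreter; some may be None on non-Linux builds.
    for name in ("EXT_SUFFIX", "SOABI", "MULTIARCH"):
        print(name, "=", sysconfig.get_config_var(name))
    print(split_ext_suffix(".cpython-38-powerpc64le-linux-gnu.so"))
```

Running it inside the container and on the affected host should show where the two interpreters disagree: only the triplet part differs.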

mattip avatar Jul 29 '20 12:07 mattip

You are right. It seemed strange to me because manylinux images are based on CentOS and my problem occurred on RHEL. The truth is that RHEL/CentOS/Fedora have a patch to change powerpc64le → ppc64le but manylinux compiles its own Pythons and uses the default setting.

I can think of two possible solutions here:

  • make Python importlib more universal: basically, allow Python to check multiple different suffixes on import time.
  • make it an (optional) feature of auditwheel: add the ability to create a symlink extension.cpython-38-ppc64le-linux-gnu.so → extension.cpython-38-powerpc64le-linux-gnu.so, which would make the produced wheels even more universal
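A rough sketch of the symlink idea from the second bullet, for illustration only. This is not an auditwheel feature; the function `add_alias_symlink` and its parameters are hypothetical. It works on an already-unpacked directory and does not address whether a symlink can be stored in a wheel archive.

```python
import os


def add_alias_symlink(ext_path, upstream="powerpc64le", alias="ppc64le"):
    """Create '<name with alias arch>' -> '<name with upstream arch>' next to ext_path.

    Hypothetical helper. Returns the alias path, or None if the file name
    does not carry the upstream architecture tag.
    """
    directory, name = os.path.split(ext_path)
    tag = f"-{upstream}-linux-gnu"
    if tag not in name:
        return None
    alias_path = os.path.join(directory, name.replace(tag, f"-{alias}-linux-gnu"))
    # Use a relative target so the link stays valid if the directory moves.
    if not os.path.lexists(alias_path):
        os.symlink(name, alias_path)
    return alias_path
```

For example, calling it on `extension.cpython-38-powerpc64le-linux-gnu.so` would leave a sibling `extension.cpython-38-ppc64le-linux-gnu.so` pointing at the original file.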

What do you think about it?

frenzymadness avatar Jul 30 '20 05:07 frenzymadness

ping @encukou who I think is mentioned in the patch you linked to. Manylinux images are based on CentOS since they had the oldest versions of glibc, but that may change soon, see gh-542. It would be good to understand why such a change was made, and if it can be dealt with by the downstream project before proposing sweeping changes to core python or the pypa projects.

mattip avatar Jul 30 '20 06:07 mattip

The patch predates me, and I wasn't very familiar with it.

Now that I look at it — the downstream patch is wrong. See details below; short story is that we screwed up :( We will try to fix it on the downstream side, and migrate the ecosystem to powerpc64le. Unfortunately, it'll take a while because of backwards compatibility issues. Based on how schedules are lined up, it might take a year.

Please close this; there's nothing to do in manylinux. But if you see any issues around this in the next months, please forward them to me or @frenzymadness.

Details

It would be good to understand why such a change was made

Platform tags for “common” architectures match Fedora's %{_arch} macro, which was used in the list of filenames generated by Python (something like /usr/lib64/python3.9/lib-dynload/_decimal.cpython-39-%{_arch}-linux-gnu.so, for example). For PPC, Python's powerpc64le does not match Fedora's ppc64le. When it was added to the set of the supported architectures, instead of adapting the file list, the configure script was patched.

At the time (2013), that was a reasonable decision: the idea of cross-Linux builds was sci-fi, and Fedora was not trying to stay close to upstream as it is now (we had 59 patches; today we're down to 6). But today, it's a problem.

To fix this without breakage, the current plan is:

  • patch importlib to consider both tags in Fedora 33 (planned release 2020-10)
  • rebuild all software with the new tags in Fedora 34+ (planned release 2021-04)
  • remove the patch again
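The first step of the plan, making the import system consider both tags, could look roughly like the sketch below. This is not the actual Fedora patch; `LEGACY_ALIASES` and `with_legacy_suffixes` are names invented for illustration. It only shows how a list of extension suffixes might be widened with the legacy spelling.

```python
import importlib.machinery

# Hypothetical table: upstream triplet -> legacy downstream triplet.
LEGACY_ALIASES = {"powerpc64le-linux-gnu": "ppc64le-linux-gnu"}


def with_legacy_suffixes(suffixes, aliases=LEGACY_ALIASES):
    """Return suffixes with a legacy-named variant added after each match."""
    result = []
    for suffix in suffixes:
        result.append(suffix)
        for upstream, legacy in aliases.items():
            # Match the full triplet, delimited by '-' and '.', so that
            # unrelated suffixes such as '.abi3.so' are left alone.
            if f"-{upstream}." in suffix:
                result.append(suffix.replace(upstream, legacy))
    return result


if __name__ == "__main__":
    # EXTENSION_SUFFIXES is the list the import system tries for extension modules.
    print(with_legacy_suffixes(importlib.machinery.EXTENSION_SUFFIXES))
```

Listing the upstream suffix first means files with the upstream name win when both spellings are present.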

encukou avatar Jul 30 '20 13:07 encukou

@encukou thanks for the quick reply, the historical context, and for outlining a concrete plan. I don't think many projects are as of yet providing public wheels for ppc64le, so hopefully the timing is not too problematic. At least as far as NumPy is concerned, our current plan is to wait for a PEP-600 manylinux format before releasing ppc64le wheels, which probably won't happen before 2020-10 anyway. For way too much info see numpy/numpy#15763.

mattip avatar Jul 30 '20 14:07 mattip

This problem is not limited to Fedora; I've also seen it on Red Hat Enterprise Linux 7.7.

I believe, however, that it is possible to fix the issue in a way that does not break backwards-compatibility and is transparent to end users. The solution is confined to pip:

  • when pip installs a binary wheel on a ppc machine, it looks at the sysconfig.get_config_var("EXT_SUFFIX") variable and checks whether that value is on the list of "official" architectures;
  • if it isn't, then pip checks its own list of known translations such as ppc64le -> powerpc64le;
  • if a translation is found, then as pip unpacks files from the binary wheel, it automatically renames all .so files so that their tags match whatever is expected by the host python.
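The renaming step proposed above can be sketched as a pure function over file names. This is not how pip behaves today; `KNOWN_TRANSLATIONS` and `target_name` are hypothetical names, and the table contains only the one translation mentioned in this thread.

```python
# Hypothetical table: architecture name used by the host Python -> upstream name.
KNOWN_TRANSLATIONS = {"ppc64le": "powerpc64le"}


def target_name(filename, host_ext_suffix, translations=KNOWN_TRANSLATIONS):
    """Return the name a wheel member should get on a host with host_ext_suffix.

    Only '.so' files whose tag uses the upstream architecture name are
    renamed, and only when the host's EXT_SUFFIX uses the translated name.
    """
    if not filename.endswith(".so"):
        return filename
    for host_arch, upstream_arch in translations.items():
        host_tag = f"-{host_arch}-linux-gnu.so"
        upstream_tag = f"-{upstream_arch}-linux-gnu.so"
        if host_ext_suffix.endswith(host_tag) and filename.endswith(upstream_tag):
            # Swap only the trailing platform tag; keep module name and ABI tag.
            return filename[: -len(upstream_tag)] + host_tag
    return filename
```

On a host whose EXT_SUFFIX already uses the upstream name, the function returns every file name unchanged, so the behavior would be transparent there.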

This approach has the advantage that it is much easier for the user to upgrade pip than to upgrade python. Also, pip's rollout cycle is much faster, which means the solution can be available to users much earlier.

st-pasha avatar Jul 30 '20 19:07 st-pasha

@encukou did the powerpc64le/ppc64le _multiarch get changed for newer CPython versions?

mattip avatar Mar 26 '23 10:03 mattip

Fedora Linux 34 changed it https://fedoraproject.org/wiki/Changes/Python_Upstream_Architecture_Names

Our Python 3.9 and older also fall back to importing files with the old Fedora names; anything newer uses the upstream names only.

hroncok avatar Mar 26 '23 10:03 hroncok

Thanks. xref @isuruf. Then this issue can be closed, right?

mattip avatar Mar 26 '23 11:03 mattip