rules_python
PyQt6 .dll/.so load fails on Windows 10 and Ubuntu
🐞 bug report
Affected Rule
I don't know. This is related to packaging the `.dll`s and `.so`s of Python requirements. The most similar-looking issue is https://github.com/bazelbuild/rules_python/issues/381.
Is this a regression?
I don't think so. I tested against 0.3.0 and 0.2.0.
Description
Running the PyQt6 hello world example on Windows 10 fails because a PyQt `.dll` cannot be found. It also fails on Ubuntu, where `.so` files cannot be found.
🔬 Minimal Reproduction
https://github.com/driftregion/rules_python_pyqt_bug_example
This issue doesn't affect all Python packages that use dynamic libraries; I have confirmed that pygame works. I have also confirmed that PyQt is actually using the `.dll`s packaged by rules_python: removing them at runtime (but before import time) produces the same error PyQt throws when a `.dll` cannot be found.
https://github.com/driftregion/rules_python_pyqt_bug_example/tree/pygame
🔥 Exception or Error
Windows
$ bazel run //main
INFO: Analyzed target //main:main (4 packages loaded, 1654 targets configured).
INFO: Found 1 target...
Target //main:main up-to-date:
bazel-bin/main/main.exe
bazel-bin/main/main.zip
INFO: Elapsed time: 18.988s, Critical Path: 5.67s
INFO: 4 processes: 3 internal, 1 local.
INFO: Build completed successfully, 4 total actions
INFO: Build completed successfully, 4 total actions
Traceback (most recent call last):
File "\\?\C:\Users\DELL\AppData\Local\Temp\Bazel.runfiles_xuoazixy\runfiles\poc\main\main.py", line 1, in <module>
from PyQt6.QtWidgets import QApplication, QLabel
ImportError: DLL load failed while importing QtWidgets: The specified module could not be found.
Ubuntu
bazel run //main
INFO: Analyzed target //main:main (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //main:main up-to-date:
bazel-bin/main/main
INFO: Elapsed time: 0.092s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Build completed successfully, 1 total action
Traceback (most recent call last):
File "/home/user/.cache/bazel/_bazel_user/7fdfd63b75551a643294e497a1b5ea03/execroot/poc/bazel-out/k8-fastbuild/bin/main/main.runfiles/poc/main/main.py", line 1, in <module>
from PyQt6.QtWidgets import QApplication, QLabel
ImportError: libQt6Widgets.so.6: cannot open shared object file: No such file or directory
🌍 Your Environment
Operating System:
Windows 10
Ubuntu 21.04
Output of `bazel version`:
Windows
Build label: 5.0.0-pre.20210623.2
Build target: bazel-out/x64_windows-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Jul 2 14:21:47 2021 (1625235707)
Build timestamp: 1625235707
Build timestamp as int: 1625235707
Ubuntu
bazel version
Build label: 4.0.0
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Thu Jan 21 07:33:24 2021 (1611214404)
Build timestamp: 1611214404
Build timestamp as int: 1611214404
Rules_python version:
0.3.0
0.2.0
I ran into the same issue with PyQt5.
I found one solution that uses a modified version of https://github.com/ali5h/rules_pip/blob/master/src/whl.py, but the AGPL license didn't work for my project.
I dug deeper with rules_python and made some progress on Linux. PyQt5 installs PyQt5-Qt5 as a dependency. That package contains the `libQt*` shared libraries that the PyQt5 `Qt*.so` files should link against. In my case, QtCore was linking to a system-installed version of `libQt5Core`:
ImportError: /usr/lib/x86_64-linux-gnu/libQt5Core.so.5: version `Qt_5.15' not found (required by external/host_deps/pypi__pyqt5/PyQt5/QtCore.abi3.so)
In Bazel 3.x, I could use `--run_under="LD_LIBRARY_PATH=/home/full/path/to/bazel-repo-name/external/host_deps/pypi__pyqt5_qt5/PyQt5/Qt5/lib"`.
After I updated to Bazel 4.x, `py_binary` includes an `env` attribute, so I was able to use a relative path to that directory:
```starlark
py_binary(
    # ...
    env = {
        "LD_LIBRARY_PATH": "external/host_deps/pypi__pyqt5_qt5/PyQt5/Qt5/lib",
    },
    deps = [
        requirement("PyQt5"),
    ],
)
```
I wish there were a better way to do this automatically. Pip installs all of the PyQt5 dependencies into the same directory, so this extra library-path redirection isn't needed outside of Bazel.

On Windows, hopefully doing something similar with the `PATH` environment variable can help you out.
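One caveat with the `PATH` suggestion above: since Python 3.8 on Windows, the interpreter no longer searches `PATH` for the DLL dependencies of extension modules, and `os.add_dll_directory` is the supported mechanism instead. A minimal sketch (the runfiles-relative path below is hypothetical and would need to match wherever rules_python actually places the Qt DLLs in your workspace):

```python
import os

def add_qt_dll_dir(dll_dir):
    """Make Qt DLLs in dll_dir findable before importing PyQt modules.

    Since Python 3.8 on Windows, PATH is no longer searched for the DLL
    dependencies of extension modules; os.add_dll_directory is the
    supported mechanism. On non-Windows platforms this is a no-op.
    """
    if hasattr(os, "add_dll_directory") and os.path.isdir(dll_dir):
        os.add_dll_directory(dll_dir)
        return True
    return False

# Hypothetical location of the Qt6 DLLs relative to the running binary;
# adjust to match the actual pyqt6_qt6 layout in your runfiles.
qt_bin = os.path.join(os.getcwd(), "pypi__pyqt6_qt6", "PyQt6", "Qt6", "bin")
add_qt_dll_dir(qt_bin)
# from PyQt6.QtWidgets import QApplication  # now has a chance to resolve
```

This would have to run before the first `PyQt6` import, e.g. at the top of the `py_binary`'s main module.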
It turns out this solution doesn't work on macOS, because System Integrity Protection disables the use of `DYLD_LIBRARY_PATH`.
I was able to manually set an RPATH for `QtCore.abi3.so` using `install_name_tool`, pointing it at `pypi__pyqt5_qt5/PyQt5/Qt5/lib`. Bazel has a cc wrapper that does this on macOS (https://github.com/bazelbuild/bazel/blob/master/tools/cpp/osx_cc_wrapper.sh.tpl).
I could also copy the contents of `pypi__pyqt5_qt5` and `pypi__pyqt5_sip` into `pypi__pyqt5`. This emulates how native pip installs into `site-packages`. Is there any way to get `pip_install` to do this?
> I could also copy the contents of `pypi__pyqt5_qt5` and `pypi__pyqt5_sip` into `pypi__pyqt5`.
Is this why the dynamic-library resolution happens correctly outside of Bazel? It seems so; you say: "Pip installs all of the PyQt5 dependencies to the same directory so this extra redirection for library path isn't needed".
I'd want to work out why PyQt6 in particular has this problem. Is its lib resolving logic non-hermetic, or making some assumption that's invalid in the Bazel context? Is the assumption something that's specified by some official Python packaging specification?
> I was able to manually set an RPATH for...

It would be good to have this workaround documented in this thread in case others arrive here with the same issue.
> Is this why the dynamic-library resolution happens correctly outside of Bazel?
Yes. This problem doesn't seem isolated to PyQt; it affects any package that ships libraries this way: multiple modules can be added as dependencies, but they include dynamic libraries with relative path information embedded in them that relies on the other modules being installed in the same root directory. I ran into this issue with pyobjc in addition to PyQt.
> I was able to manually set an RPATH for...
After the first `bazel build` downloads the host deps into the `bazel-<workspace_name>` directory, I ran this command for the `.abi3.so` files:

```shell
$ install_name_tool bazel-<workspace_name>/external/host_deps/pypi__pyqt5/PyQt5/QtCore.abi3.so -add_rpath "/Users/<username>/<path_to_repo>/bazel-<workspace_name>/external/host_deps/pypi__pyqt5_sip/PyQt5"
```
I reconstructed this command from the `osx_cc_wrapper` script linked above, but never did the Bazel integration.
I was able to temporarily unblock my development by moving PyQt from a `requirements.txt` to a bootstrap/provision script that installs the modules globally.
This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!
This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"
I missed that this issue was marked stale. As far as I know, this issue doesn't have a solution. Can it be reopened?
+1, I hit the same issue. I don't think the workaround above is a good solution. Has there been any progress on this?
Should a new bug be opened to continue discussing the more general issue of using `pip_parse` with requirements that depend on each other being installed into the same directory, or should discussion continue here?
As pointed out by several people above, the underlying issue doesn't really have anything to do with `LD_LIBRARY_PATH` being set incorrectly. That is just one of many errors one has to troubleshoot in order to use the current rules_python with Python packages (like `pyqt6` and `pyqt6_qt6`) that both install output into the "same" directory.
Because it was hard for me to parse the conversation above without just recreating the environment myself, I'll include a simple visual example.
Let's pretend for simplicity that `pyqt6` simply installs a single module `PyQt6.stuff1` and a shared library `libx.so`. Similarly, let's pretend that `pyqt6_qt6` installs the module `PyQt6.stuff2` and the shared library `liby.so`.
Then, after `pip install pyqt6 pyqt6_qt6`, you would expect your site-packages to look like:
$SITE_PACKAGES/
└── PyQt6
├── libx.so
├── liby.so
├── stuff1.py
└── stuff2.py
However, since `pip_parse` downloads each PyPI package into its own sandbox, the runfiles of a `py_binary` that `deps` on `requirement("pyqt6")` and `requirement("pyqt6_qt6")` will instead look like:
$RUNFILES/
├── pip_pyqt6/
│ └── site-packages/
│ └── PyQt6/
│ ├── libx.so
│ └── stuff1.py
└── pip_pyqt6_qt6/
└── site-packages/
└── PyQt6/
├── liby.so
└── stuff2.py
There are many very common assumptions used in "standard" Python that can easily be broken by this change. In order from probably most well-known to least well-known IME:
- Relative imports will be broken only in Bazel:
  - `stuff1.py` will fail when attempting to `from . import stuff2`.
- Only modules in one of the `PyQt6` folders can be found at once.
  - In the example above, `import PyQt6.stuff2` will currently fail outright: when looking for `PyQt6` on the `PYTHONPATH`, Python will find `$RUNFILES/pip_pyqt6/site-packages` before `$RUNFILES/pip_pyqt6_qt6/site-packages` and assume that is the "one true source" of `PyQt6`.
  - Note that which one is found is `PYTHONPATH`-order-dependent.
- `libx.so` will not be able to "see" `liby.so` (or vice versa): the original description of this issue.
  - While this can be solved by manually patching the `rpath` of the libraries for this simple case, that works against the fact that the package maintainers have typically already done the hard work of setting the `rpath` to work correctly... assuming the libraries are installed into the same folder.
  - For example, if there were another `pip_pyqt6_qt6/site-packages/PyQt6/Qt6/lib/*.so` with their `rpath` already set to `../..`, it feels very silly to have to set it to `../../../../pip_pyqt6/site-packages/PyQt6` purely for Bazel.
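The module-shadowing point above can be reproduced without Bazel or PyQt at all. A minimal sketch using a hypothetical package `pkgdemo` split across two sandbox directories:

```python
import os
import sys
import tempfile

# Recreate the split-runfiles layout: two sandboxes each contribute part
# of the same top-level package, here a made-up "pkgdemo".
root = tempfile.mkdtemp()
for sandbox, module in [("pip_a", "stuff1"), ("pip_b", "stuff2")]:
    pkg = os.path.join(root, sandbox, "site-packages", "pkgdemo")
    os.makedirs(pkg)
    # The __init__.py makes pkgdemo a regular (non-namespace) package,
    # as PyQt6's is, so the two copies are NOT merged by the importer.
    open(os.path.join(pkg, "__init__.py"), "w").close()
    with open(os.path.join(pkg, module + ".py"), "w") as f:
        f.write("VALUE = %r\n" % module)
    sys.path.append(os.path.join(root, sandbox, "site-packages"))

import pkgdemo.stuff1  # resolved in pip_a, the first sys.path match

try:
    import pkgdemo.stuff2  # exists only in pip_b, but pkgdemo is already
    shadowed = False       # bound to the pip_a copy...
except ModuleNotFoundError:
    shadowed = True        # ...so this import fails
```

Because the first `pkgdemo` found on `sys.path` wins, `pkgdemo.stuff2` raises `ModuleNotFoundError` even though it is importable on its own path entry.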
Clearly, there are several approaches to knock out this problem in the general case, depending on how rules_python wants to "think about" how these packages should be treated, and how to expose these "combined" packages in the `@pip_deps//:requirements.bzl` API.
The first solution one might think of would be to check at `pip_parse` time whether or not any of the `deps` of a package contain top-level `site-packages` directories in common with the package itself.

This would solve the particular case of PyQt6, but we cannot assume that packages of this kind will necessarily `deps` on each other. For example, many highly-used libraries (Napari, Sphinx, etc.) use a similar installation mechanism under the hood for first-party extensions. In that case, you would want to be able to specify just the first-party extensions actually being used as `deps`, e.g.,
```starlark
requirement("napari"),
requirement("napari-extension-1"),
...
requirement("napari-extension-N"),
```
This specification would happen in a `BUILD` file, meaning that, to give users the flexibility they will ultimately want, we cannot create separate repositories at `pip_parse` time; we must instead create a regular build rule that combines the `site-packages` folders of each of these extensions into a single `py_library`.
Note that we cannot simply combine multiple `@pip_*//:pkg` library targets into a single `py_library`, because that would leave the original `pip_parse`-based "uncombined" versions of the packages on the `PYTHONPATH`.
If I understand the current setup of rules_python correctly, this should be as simple as:

- adding a `site_packages` macro (analogous to `requirement`, `entry_point`, etc.) that points to a new filegroup defined in the private BUILD file template, and
- adding a new rule `multi_source_package` (not sure what a good name would be here) whose only new attr would be a `label_list` meant to take a list of `site_packages()` targets and create a combined `py_library` rule.
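For concreteness, usage of the proposal might look something like the sketch below. Both `site_packages` and `multi_source_package` are the hypothetical names proposed above, not existing rules_python API, and the load locations are guesses:

```starlark
load("@pip_deps//:requirements.bzl", "site_packages")  # hypothetical macro
load("@rules_python//python:defs.bzl", "multi_source_package")  # hypothetical rule

# Combine the site-packages trees of pyqt6 and pyqt6_qt6 into a single
# py_library whose runfiles contain one merged PyQt6/ directory.
multi_source_package(
    name = "pyqt6_combined",
    packages = [
        site_packages("pyqt6"),
        site_packages("pyqt6_qt6"),
    ],
)
```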
Is there interest in supporting this (quite common) use case? I know that my company (and several others) solve this by running our own PyPI repo or constructing these "multi-source" artifacts out-of-band, but the overhead of doing this at a smaller or newer company is quite high, and this seems like an easy enough quality-of-life improvement that would acknowledge how Python is really used in the wild without exposing any significant new API surface (just the very simple `site_packages` and `multi_source_package` macros).
I am reopening to keep the discussion context from above in one place. Thanks @bruno-digitbio for a nice write-up.

Macros that live in requirements.bzl are somewhat hard to maintain, and they increase the API surface in a way that may be difficult to deal with later. If I understand correctly, rules_py might be unaffected by this issue, because they create venv-like folder structures for the final targets, and exploring that direction in rules_python could be worthwhile. However, I cannot say for sure without a deeper look. That said, I have been hit by some of these side effects in the past as well; see #1242.
Thanks for the quick response and feedback!
> Macros that live in requirements.bzl are somewhat hard to maintain
After some squinting, I can see how adding a macro to a dynamically-generated `.bzl` file actually would have some sneaky maintenance costs. That said, I had not realized that we now officially condone using `@{name}_{pkg}//:pkg` directly. Would adding a `filegroup` like
like
```starlark
filegroup(
    name = "{site_packages_label}",
    srcs = glob(
        ["site-packages/**"],
        exclude = ["site-packages/*.dist-info/**"],
        allow_empty = True,
    ),
)
```
to the build file generation code be any better?
> rules_py ... are creating venv like folder structures for the final targets
My first thought had actually also been to propose that a full secondary venv-like layer should sit above the current rules_python, but it felt like this might involve a rework of the `py_library` implementation itself, which I assumed was much less open to change than the repository rules. It sounds like I might have guessed wrong here?
If both are on the table, then I agree that modifying `py_library` to output a more venv-like directory structure (i.e., a "single" `site-packages` instead of one per external dep) is a cleaner solution than adding new utilities just to work around the fact that each external dependency is currently isolated.
I personally think we have to move carefully so that we don't break existing use cases while still solving the actual problem. The friction between "everything assumes a venv-like layout" and "each package lives in a separate dir and we just manipulate env vars" is real, and we have to be careful that any API extension moves us closer to the end goal.
As you mentioned previously, bigger companies that are more invested in Python Bazel tooling have workarounds, and it would be interesting to learn from those experiences: knowing what does and does not work with the filegroup approach would be super useful when attempting to change rules_python itself.
This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!
This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"