wheels with bundled native binaries - feature request
Packaging native dynamically loaded libraries inside wheel distributions is not a good way to manage packages, and is therefore not a good match for Nix/NixOS. But many Python packages distributed as wheels can in practice be executed from within a nix environment, given that the package author has bundled all the native libraries needed to run the application (not including core C/Linux libraries like libc.so, ld-linux.so, etc.). This is of course a closed-source approach to package distribution and could pose a security risk. For minimal security, each wheel package should come with a RECORD text file which includes SHA sums of each artifact within the bundle. Example:
cv2/.libs/libz-a147dcb0.so.1.2.3,sha256=VwXH3AM7bnoa793tKDw_H0pW-VZos08-FEtM_g_VWVM,87848
We could verify the SHA-256 sums of each file before doing anything, to provide some minimal security, though I'm not an expert on this topic.
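As a rough sketch of what that check could look like, assuming the wheel has already been unpacked and relying on the standard RECORD format of `path,sha256=<urlsafe-base64 digest without padding>,size` (the function name `verify_record` is just illustrative, not existing pypi2nix API):

```python
import base64
import csv
import hashlib
from pathlib import Path


def verify_record(wheel_dir: Path, record_path: Path) -> list:
    """Return the paths whose sha256 does not match their RECORD entry."""
    mismatches = []
    with record_path.open(newline="") as f:
        for path, hash_spec, _size in csv.reader(f):
            if not hash_spec:  # the RECORD file itself has no hash entry
                continue
            algo, _, expected = hash_spec.partition("=")
            if algo != "sha256":
                continue
            digest = hashlib.sha256((wheel_dir / path).read_bytes()).digest()
            actual = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
            if actual != expected:
                mismatches.append(path)
    return mismatches
```

Any mismatch could abort the build before we start patching anything.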
To achieve this, we will need to apply a classic hack which is commonly used in nixpkgs: patchelf-ing the binaries. For example, this is what is done to the precompiled Google Chrome binaries https://github.com/NixOS/nixpkgs/blob/84cf00f98031e93f389f1eb93c4a7374a33cc0a9/pkgs/applications/networking/browsers/google-chrome/default.nix#L128-L131 though in the case of Google Chrome, all 3rd-party dependencies are provided from nixpkgs, which we can't do in the case of wheel binaries (since there's no good way to automatically detect the 3rd-party dependencies from a binary alone).
One way to do this would be to assume that all bundled binaries are needed to run the wheel. Often the binaries include a library rpath, which we can ignore if we want, and instead just flatten the rpath tree.
For each file we would need to patchelf the runpath to be $out/lib, and place the files there, so the nix runtime dynamic linker can find them. We would also need to patch the interpreter so the binaries can find the ELF interpreter (ld-linux) belonging to the nix environment.
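A hedged sketch of that patching step, assuming `patchelf` is available in the build environment; `out_lib` stands for $out/lib and `dynamic_linker` for the stdenv's ld-linux path (e.g. what nixpkgs exposes via the C compiler's nix-support files) — both names are illustrative:

```python
import shutil
import subprocess
from pathlib import Path


def patch_bundled_libs(libs: list, out_lib: Path, dynamic_linker: str) -> None:
    """Copy bundled libraries into $out/lib and rewrite their ELF headers."""
    out_lib.mkdir(parents=True, exist_ok=True)
    for lib in libs:
        target = out_lib / lib.name
        shutil.copy2(lib, target)  # place the library in $out/lib
        # Flatten the rpath tree: every binary looks only in $out/lib.
        subprocess.run(
            ["patchelf", "--set-rpath", str(out_lib), str(target)], check=True
        )
        # Only files with a PT_INTERP segment (executables) carry an
        # interpreter; patchelf fails on plain shared objects, so we don't
        # treat that as a build error here.
        subprocess.run(
            ["patchelf", "--set-interpreter", dynamic_linker, str(target)],
            check=False,
        )
```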
During the build we can verify whether the provided wheel binaries can in fact be used. For example, running ldd on the binaries shows whether the linker is able to discover all the libraries needed.
Example:
linux-vdso.so.1 (0x00007ffd97f8d000)
libavcodec-4cf96bc1.so.58.65.103 => not found
libavformat-b798543f.so.58.35.101 => not found
libavutil-ac3ec209.so.56.38.100 => not found
libswscale-99a5f1f1.so.5.6.100 => not found
libQtGui-903938cd.so.4.8.7 => not found
libQtTest-1183da5d.so.4.8.7 => not found
libQtCore-ada04e4a.so.4.8.7 => not found
In cases where even one dependency is not found, we can abort the pypi2nix process, alert the user that this won't be possible, and give a message explaining why it failed: some of the missing binary dependencies could be provided to pypi2nix (e.g. -E zlib) and, living under one stdenv, could fill in the dependencies which the package didn't provide.
I'm not sure if ldd is the best tool for this, but we could do some research before writing a stdout parser that looks for "not found" lines (which is always a solid plan B).
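For illustration, that plan-B parser could be as small as this (hypothetical helper name; also keep in mind that ldd actually invokes the dynamic linker on the file, so it should be treated as a build-time heuristic, not a security boundary):

```python
import subprocess


def missing_dependencies(binary: str) -> list:
    """Return the sonames that the dynamic linker could not resolve."""
    result = subprocess.run(["ldd", binary], capture_output=True, text=True)
    missing = []
    for line in result.stdout.splitlines():
        # ldd prints e.g. "libQtGui-903938cd.so.4.8.7 => not found"
        if "not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing
```

The collected names could then go straight into the error message, so the user knows which dependencies to try supplying via -E.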
This feature could hopefully be implemented relatively easily with a few extra methods in src/pypi2nix/wheel_builder.py.
The methods would:
- verify the binaries
- patch the interpreter
- patch the rpaths
- verify discoverability of dependencies
- report errors if found