
The new scikit-build-core setup copies external shared objects into Python wheel

Open jdblischak opened this issue 2 months ago • 4 comments

There are various situations where we want to be able to build tiledbvcf-py against an existing external libtiledbvcf.so:

  • In the conda recipes, we build a separate conda binary for libtiledbvcf, so we don't want a redundant shared object stored in the Python conda binary.
  • In my nightly setup, I build libtiledbvcf separately from tiledbvcf-py. When I run the tiledbvcf-py tests, I want to know that they use the external libtiledbvcf and not the copy bundled in the Python package.
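
For the second case, one way to confirm which copy was actually loaded is to inspect the process's memory mappings after the import. This is only a minimal sketch, assuming Linux (it parses the text of `/proc/self/maps`); `mapped_libraries` is a hypothetical helper name and not part of tiledbvcf.

```python
from __future__ import annotations

from pathlib import Path


def mapped_libraries(maps_text: str, name: str = "libtiledbvcf.so") -> list[str]:
    """Return unique mapped file paths whose file name starts with `name`.

    `maps_text` is the content of /proc/<pid>/maps. The default `name`
    deliberately excludes the extension module
    (libtiledbvcf.cpython-3XX-...), whose name does not start with
    "libtiledbvcf.so".
    """
    paths: list[str] = []
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, [pathname]
        fields = line.split(maxsplit=5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5]
        if Path(path).name.startswith(name) and path not in paths:
            paths.append(path)
    return paths
```

After `import tiledbvcf`, calling `mapped_libraries(Path("/proc/self/maps").read_text())` should list the path of whichever libtiledbvcf.so the dynamic loader picked, so a test run can fail early if that path is inside the installed Python package.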

This is the same situation that we previously addressed for tiledbsoma-py in https://github.com/single-cell-data/TileDB-SOMA/pull/1937 and https://github.com/single-cell-data/TileDB-SOMA/pull/2221. Unfortunately, tiledbsoma-py uses setup.py, so I can't directly apply that solution to the scikit-build-core setup we now use for tiledbvcf-py.

I think two things need to happen:

  1. Do not copy shared objects into the wheel
  2. Do not edit the RUNPATH (so that libtiledbvcf.cpython-3XX-x86_64-linux-gnu.so can still find the external libtiledbvcf.so at runtime)
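
The second point can be checked mechanically by parsing `readelf -d` output for the extension module. The sketch below is illustrative only, assuming `readelf` from binutils is on the PATH; `runpath_entries` and `read_runpath` are hypothetical helper names.

```python
from __future__ import annotations

import re
import subprocess


def runpath_entries(readelf_output: str) -> list[str]:
    """Extract RUNPATH (or legacy RPATH) directories from `readelf -d` output."""
    match = re.search(
        r"\((?:RUNPATH|RPATH)\)\s+Library r(?:un)?path: \[(.*?)\]",
        readelf_output,
    )
    if match is None:
        return []
    # Entries are colon-separated; drop empty ones such as a trailing ":".
    return [entry for entry in match.group(1).split(":") if entry]


def read_runpath(shared_object: str) -> list[str]:
    """Run `readelf -d` on a shared object and return its RUNPATH entries."""
    result = subprocess.run(
        ["readelf", "-d", shared_object],
        check=True,
        capture_output=True,
        text=True,
    )
    return runpath_entries(result.stdout)
```

A result of `["$ORIGIN"]` alone would mean the RUNPATH only points at the package directory, while a result that still contains the external lib directory would mean the build-time RUNPATH was preserved.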

Here is a reprex to demonstrate the current shared object copying behavior:

docker run --rm -it ubuntu:22.04

# Setup
apt-get update
apt-get install --yes git python-is-python3 python3 python3-pip python3-venv unzip wget
cd
mkdir downloads install-libtiledb install-libtiledbvcf

# Install nightly binaries
the_date=2024-04-30
wget --quiet -P downloads/ \
  https://github.com/jdblischak/centralized-tiledb-nightlies/releases/download/$the_date/libtiledb-$the_date.tar.gz
wget --quiet -P downloads/ \
  https://github.com/jdblischak/centralized-tiledb-nightlies/releases/download/$the_date/libtiledbvcf-$the_date.tar.gz
tar -C install-libtiledb -xzf downloads/libtiledb-$the_date.tar.gz
tar -C install-libtiledbvcf -xzf downloads/libtiledbvcf-$the_date.tar.gz
export LD_LIBRARY_PATH=$(pwd)/install-libtiledb/lib:$(pwd)/install-libtiledbvcf/lib
ldd install-libtiledbvcf/lib/libtiledbvcf.so | grep libtiledb.so
##         libtiledb.so.2.21 => /root/install-libtiledb/lib/libtiledb.so.2.21 (0x00007fdffbde6000)

# Clone TileDB-VCF
git clone https://github.com/TileDB-Inc/TileDB-VCF.git
cd TileDB-VCF/

# Build and install tiledbvcf-py
export LIBTILEDBVCF_PATH=${HOME}/install-libtiledbvcf/lib/
python -m pip install -v apis/python
##   -- Searching for libtiledbvcf in /root/TileDB-VCF/apis/python/../../dist/lib;/root/install-libtiledbvcf/lib/
##   -- Found libtiledbvcf: /root/install-libtiledbvcf/lib/libtiledbvcf.so
##
##   *** Installing project into wheel...
##   -- Install configuration: "Release"
##   -- Installing: /tmp/tmp7pmlfypj/wheel/platlib/tiledbvcf/libtiledbvcf.cpython-310-x86_64-linux-gnu.so
##   -- Set non-toolchain portion of runtime path of "/tmp/tmp7pmlfypj/wheel/platlib/tiledbvcf/libtiledbvcf.cpython-310-x86_64-linux-gnu.so" to "$ORIGIN"
##   -- Installing: /tmp/tmp7pmlfypj/wheel/platlib/tiledbvcf/libtiledbvcf.so
##   -- Installing: /tmp/tmp7pmlfypj/wheel/platlib/tiledbvcf/libhts.so.1.19.1
##   -- Installing: /tmp/tmp7pmlfypj/wheel/scripts/tiledbvcf
python -c "import tiledbvcf; print(tiledbvcf.version)"
## 0.31.1.dev7+g7cc19c10

# External shared objects are installed into Python package
ls /usr/local/lib/python3.10/dist-packages/tiledbvcf/*.so*
## /usr/local/lib/python3.10/dist-packages/tiledbvcf/libhts.so.1.19.1
## /usr/local/lib/python3.10/dist-packages/tiledbvcf/libtiledbvcf.cpython-310-x86_64-linux-gnu.so
## /usr/local/lib/python3.10/dist-packages/tiledbvcf/libtiledbvcf.so

# And the RUNPATH is edited so that it no longer points to the external shared object
readelf -d /usr/local/lib/python3.10/dist-packages/tiledbvcf/libtiledbvcf.cpython-310-x86_64-linux-gnu.so | grep RUNPATH
##  0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN]
readelf -d apis/python/build/cp310-cp310-linux_x86_64/libtiledbvcf.cpython-310-x86_64-linux-gnu.so | grep RUNPATH
##  0x000000000000001d (RUNPATH)            Library runpath: [/root/install-libtiledbvcf/lib:]

# Build wheel
python -m pip wheel -v --wheel-dir=apis/python/dist apis/python
##   -- Searching for libtiledbvcf in /root/TileDB-VCF/apis/python/../../dist/lib;/root/install-libtiledbvcf/lib/
##   -- Found libtiledbvcf: /root/install-libtiledbvcf/lib/libtiledbvcf.so
##
##   *** Installing project into wheel...
##   -- Install configuration: "Release"
##   -- Installing: /tmp/tmp34k3_ajz/wheel/platlib/tiledbvcf/libtiledbvcf.cpython-310-x86_64-linux-gnu.so
##   -- Set non-toolchain portion of runtime path of "/tmp/tmp34k3_ajz/wheel/platlib/tiledbvcf/libtiledbvcf.cpython-310-x86_64-linux-gnu.so" to "$ORIGIN"
##   -- Installing: /tmp/tmp34k3_ajz/wheel/platlib/tiledbvcf/libtiledbvcf.so
##   -- Installing: /tmp/tmp34k3_ajz/wheel/platlib/tiledbvcf/libhts.so.1.19.1
##   -- Installing: /tmp/tmp34k3_ajz/wheel/scripts/tiledbvcf

# The external shared objects are copied into the wheel
unzip -l apis/python/dist/tiledbvcf-0.31.1.dev7+g7cc19c10-cp310-cp310-linux_x86_64.whl | grep -F '.so'
##   1536208  2024-05-01 02:18   tiledbvcf/libhts.so.1.19.1
##    254392  2024-05-01 18:20   tiledbvcf/libtiledbvcf.cpython-310-x86_64-linux-gnu.so
##   3585408  2024-05-01 02:18   tiledbvcf/libtiledbvcf.so
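
The same wheel inspection could be scripted, for example as a guard in CI. This is a minimal sketch using only the Python standard library; `bundled_shared_objects` is a hypothetical helper, and it assumes that extension modules can be recognised by `.cpython-` or `.abi3.` in their file names.

```python
from __future__ import annotations

import re
import zipfile

# Matches "libfoo.so" and versioned names such as "libhts.so.1.19.1".
_SHARED_OBJECT = re.compile(r"\.so(\.\d+)*$")


def bundled_shared_objects(wheel_path: str) -> list[str]:
    """List shared objects in a wheel that are not Python extension modules."""
    with zipfile.ZipFile(wheel_path) as wheel:
        names = wheel.namelist()
    return sorted(
        name
        for name in names
        if _SHARED_OBJECT.search(name)
        and ".cpython-" not in name
        and ".abi3." not in name
    )
```

For a wheel like the one listed above, this would be expected to return `tiledbvcf/libhts.so.1.19.1` and `tiledbvcf/libtiledbvcf.so`. An empty list would indicate that only the extension module was packaged.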

xref: #701, #702

jdblischak, May 01 '24 18:05