Scikit-build-core build system rework
This PR reworks build system for TileDB-Py to scikit-build-core + CMake. To comply with our packaging requirements I made sure that the TILEDB_PATH environment variable is properly handled. In all cases RPATH is set to the location where TileDB core library is present.
Additionally, Github workflows were added to create, test and upload PyPI wheels and sdist packages.
[sc-49832]
Let's not forget to update the docs: https://docs.tiledb.com/main/how-to/installation/building-from-source/python It would be good to have a draft PR here: https://github.com/TileDB-Inc/TileDB-Developer-Docs before merging this PR.
Let's not forget to update the docs: https://docs.tiledb.com/main/how-to/installation/building-from-source/python It would be good to have a draft PR here: https://github.com/TileDB-Inc/TileDB-Developer-Docs before merging this PR.
same for this: https://github.com/TileDB-Inc/TileDB-Py/blob/367dbe97c0b396b1a413a0b70dae5a81e1f216fa/CONTRIBUTING.md
I also re-implemented daily tests CI/CD: https://github.com/dudoslav/TileDB-Py/actions/runs/9596846267
Wheel contents difference: https://www.diffchecker.com/4spoJ46a/
Latest diff with Licences: https://www.diffchecker.com/SxLbtGmC/
Latest Diff: https://www.diffchecker.com/eFdvkppd/
Thanks
- we can exclude all of
docandci - seems like this is a local thing that should not have been included?
Thanks
* we can exclude all of `doc` and `ci` * seems like this is a local thing that should not have been included?![]()
I removed doc and ci folders and made sure that I create wheel in a clean environment. This is the result right now:
https://www.diffchecker.com/9i4Ge9l2/
Another difference that I found is that the original setup.py approach puts the libtiledb.so in a different place. I did not notice any difference when running the wheel or testing it, but I want to mention it:
I have noticed the following packaging behavior:
When running pipx run build if TileDB core library is found in the system it gets picked up and used as the core TileDB library for the wheel. This however never happens in our CI/CD when deploying wheels, so it should be fine, but it can lead to incorrect versions of TileDB included inside wheel for packages produced manually, for example in a Conda environment with TileDB installed in it.
Testing if Conda feedstock will work: https://github.com/conda-forge/tiledb-py-feedstock/pull/224
Seems like conda packages were able to produce: https://github.com/conda-forge/tiledb-py-feedstock/pull/224
I tested the shared object copying behavior similar to how I did for TileDB-VCF (https://github.com/TileDB-Inc/TileDB-VCF/issues/717).
I ran into 2 issues:
- When bundling
libtiledb.sointo the installed Python package, I get the errorAttributeError: module 'tiledb.cc' has no attribute 'Config'when runningimport tiledb - When using an external
libtiledb.so, the compilation fails
docker run --rm -it ubuntu:22.04
# Setup
apt-get update
apt-get install --yes git python-is-python3 python3 python3-pip python3-venv unzip wget
cd
mkdir downloads install-libtiledb
# Install nightly binary
the_date=2024-04-08
wget --quiet -P downloads/ \
https://github.com/jdblischak/centralized-tiledb-nightlies/releases/download/$the_date/libtiledb-$the_date.tar.gz
tar -C install-libtiledb -xzf downloads/libtiledb-$the_date.tar.gz
export LD_LIBRARY_PATH=$(pwd)/install-libtiledb/lib
# Clone TileDB-Py
git clone https://github.com/TileDB-Inc/TileDB-Py.git
cd TileDB-Py/
git checkout db/scikit-build-core
# Build and install TileDB-Py **with** bundled libtiledb
python -m pip install -v .
## -- Downloading TileDB default version ...
## -- [download 100% complete]
## -- Adding "libtiledb" into install group
##
## -- Installing: /tmp/tmppmr6xlbn/wheel/platlib/tiledb/libtiledb.so.2.24
##
## Successfully installed numpy-2.0.0 tiledb-0.30.3.dev109+gd18362b
ls /usr/local/lib/python3.10/dist-packages/tiledb/*.so*
## /usr/local/lib/python3.10/dist-packages/tiledb/cc.cpython-310-x86_64-linux-gnu.so
## /usr/local/lib/python3.10/dist-packages/tiledb/libtiledb.cpython-310-x86_64-linux-gnu.so
## /usr/local/lib/python3.10/dist-packages/tiledb/libtiledb.so.2.24
## /usr/local/lib/python3.10/dist-packages/tiledb/main.cpython-310-x86_64-linux-gnu.so
ldd /usr/local/lib/python3.10/dist-packages/tiledb/libtiledb.cpython-310-x86_64-linux-gnu.so | grep libtiledb
## libtiledb.so.2.24 => /usr/local/lib/python3.10/dist-packages/tiledb/libtiledb.so.2.24 (0x00007f3ac2e8a000)
readelf -d /usr/local/lib/python3.10/dist-packages/tiledb/libtiledb.cpython-310-x86_64-linux-gnu.so | grep RUNPATH
## 0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN]
python -c "import tiledb; tiledb.libtiledb.version()"
## Traceback (most recent call last):
## File "<string>", line 1, in <module>
## File "/root/TileDB-Py/tiledb/__init__.py", line 22, in <module>
## from .array_schema import ArraySchema
## File "/root/TileDB-Py/tiledb/array_schema.py", line 10, in <module>
## from .attribute import Attr
## File "/root/TileDB-Py/tiledb/attribute.py", line 9, in <module>
## from .ctx import Ctx, CtxMixin
## File "/root/TileDB-Py/tiledb/ctx.py", line 16, in <module>
## class Config(lt.Config):
## AttributeError: module 'tiledb.cc' has no attribute 'Config'
python -m pip uninstall --yes tiledb
# Build and install TileDB-Py **without** bundled libtiledb
export TILEDB_PATH=${HOME}/install-libtiledb/
python -m pip install -v .
## -- Found external TileDB core library
## -- Setting RPATH for targets "main" and "libtiledb" to /root/install-libtiledb/lib
## -- Setting RPATH for target "cc" to /root/install-libtiledb/lib
##
## In file included from /tmp/tmp8dwduzb1/build/tiledb/libtiledb.cc:1248:
## /root/install-libtiledb/include/tiledb/tiledb.h:2973:34: note: declared here
## 2973 | TILEDB_DEPRECATED_EXPORT int32_t tiledb_array_delete_fragments(
## | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## ninja: build stopped: subcommand failed.
##
## *** CMake build failed
## error: subprocess-exited-with-error
##
## × Building wheel for tiledb (pyproject.toml) did not run successfully.
## │ exit code: 1
## ╰─> See above for output.
##
## note: This error originates from a subprocess, and is likely not a problem with pip.
## full command: /usr/bin/python /usr/lib/python3/dist-packages/pip/_vendor/pep517/in_process/_in_process.py build_wheel /tmp/tmpkbu_gaxq
## cwd: /root/TileDB-Py
## Building wheel for tiledb (pyproject.toml) ... error
## ERROR: Failed building wheel for tiledb
## Failed to build tiledb
## ERROR: Could not build wheels for tiledb, which is required to install pyproject.toml-based projects
@ihnorton explained to me the source of the import error I reported above. It's because I ran it within the Git repo, and by default Python looks for ./module.py and ./module/__init__.py. Thus it ran the local one instead of the one installed in site-packages. Thus the fix it to either run an editable install (-e) or move outside of the Git repo.
The reprex below confirms that the import works fine outside of the Git repo:
docker run --rm -it ubuntu:22.04
# Setup
apt-get update
apt-get install --yes git python-is-python3 python3 python3-pip python3-venv unzip wget
cd
# Clone TileDB-Py
git clone https://github.com/TileDB-Inc/TileDB-Py.git
cd TileDB-Py/
git checkout db/scikit-build-core
# Build and install TileDB-Py **with** bundled libtiledb
python -m pip install -v .
# Run import test outside of Git repo
cd
python -c "import tiledb; print(tiledb.libtiledb.version())"
## (2, 24, 0)
Also, I switched to using the 2.24.2 release tarball, and I was able to compile TileDB-Py against the external libtiledb. The shared object isn't copied, and all the linking looks good.
docker run --rm -it ubuntu:22.04
# Setup
apt-get update
apt-get install --yes git python-is-python3 python3 python3-pip python3-venv unzip wget
cd
mkdir downloads install-libtiledb
# Install 2.24.2 release binary
# https://github.com/TileDB-Inc/TileDB/releases/tag/2.24.2
wget --quiet -P downloads/ \
https://github.com/TileDB-Inc/TileDB/releases/download/2.24.2/tiledb-linux-x86_64-2.24.2-76cd03c.tar.gz
tar -C install-libtiledb -xzf downloads/tiledb-linux-x86_64-2.24.2-76cd03c.tar.gz
export LD_LIBRARY_PATH=$(pwd)/install-libtiledb/lib
# Clone TileDB-Py
git clone https://github.com/TileDB-Inc/TileDB-Py.git
cd TileDB-Py/
git checkout db/scikit-build-core
# Build and install TileDB-Py **without** bundled libtiledb
export TILEDB_PATH=${HOME}/install-libtiledb/
python -m pip install -vv .
## -- Found external TileDB core library
## -- Setting RPATH for targets "main" and "libtiledb" to /root/install-libtiledb/lib
## -- Setting RPATH for target "cc" to /root/install-libtiledb/lib
ls /usr/local/lib/python3.10/dist-packages/tiledb/*.so*
## /usr/local/lib/python3.10/dist-packages/tiledb/cc.cpython-310-x86_64-linux-gnu.so
## /usr/local/lib/python3.10/dist-packages/tiledb/libtiledb.cpython-310-x86_64-linux-gnu.so
## /usr/local/lib/python3.10/dist-packages/tiledb/main.cpython-310-x86_64-linux-gnu.so
ldd /usr/local/lib/python3.10/dist-packages/tiledb/libtiledb.cpython-310-x86_64-linux-gnu.so | grep libtiledb
## libtiledb.so.2.24 => /root/install-libtiledb/lib/libtiledb.so.2.24 (0x00007f7dda6f6000)
readelf -d /usr/local/lib/python3.10/dist-packages/tiledb/libtiledb.cpython-310-x86_64-linux-gnu.so | grep RUNPATH
## 0x000000000000001d (RUNPATH) Library runpath: [/root/install-libtiledb/lib]
cd
python -c "import tiledb; print(tiledb.libtiledb.version())"
## (2, 24, 2)
Lastly, I got this PR working in a branch of my centralized-tiledb-nightlies, so I'll be able to quickly migrate the nightly TileDB-Py build after this is merged.
The wheel size is most probably due to strip=False and build_type=RelWithDebInfo. Latest commit removed these options, lets see if the size will get smaller.
Latest wheels are about 15MB in size