Release Python bindings to PyPI

Open althonos opened this issue 8 months ago • 2 comments

Hi! I was about to have a look at making bindings for strobealign, but you made some already (and it turns out I already contributed to them in the past, my memory is that of a goldfish lol 😃). Would you consider releasing the library to PyPI? It would make it much easier to integrate with other Python projects.

So far I can get the project to build with cibuildwheel with minimal changes to pyproject.toml, so making wheels would be quite easy as well. You could do that in CI to release them on tagged commits (I have an example here for a large matrix of Python versions and platforms: https://github.com/althonos/pyjess/blob/main/.github/workflows/package.yml, but in essence you just need the pypa/cibuildwheel action).

The updated pyproject.toml:

```toml
[build-system]
build-backend = "setuptools.build_meta"
requires = [
    "setuptools>=42",
    "wheel",
    "scikit-build>=0.13",
    "cmake>=3.18",
    "ninja; platform_system!='Windows'",
    "nanobind>=0.2.0",
]

[project]
name = "strobealign"
version = "0.16.0"
license = "MIT"
requires-python = ">=3.10"

[tool.pytest.ini_options]
testpaths = ["tests"]

[tool.cibuildwheel.linux]
before-all = "yum install -y isa-l-devel"
```

I would also suggest switching to scikit-build-core instead of scikit-build for better forward compatibility. This would also avoid having to require cmake and ninja manually, as scikit-build-core handles the build dependencies a bit better (see https://scikit-build-core.readthedocs.io/). Also happy to help / PR for that 👍
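
For illustration, here is a minimal sketch of what the `[build-system]` table might look like after such a switch. It is untested against strobealign: the nanobind pin is simply carried over from the file above, scikit-build-core is left unpinned on purpose, and the CMake side would likely need adjusting as well (scikit-build-core expects the extension module to be placed into the wheel via CMake `install()` rules).

```toml
# Sketch only, not tested against the strobealign build.
[build-system]
# scikit-build-core's PEP 517 backend replaces setuptools + scikit-build.
build-backend = "scikit_build_core.build"
requires = [
    # cmake and ninja are no longer listed here: scikit-build-core
    # adds them as build requirements itself when no suitable
    # system version is found.
    "scikit-build-core",
    # Pin carried over unchanged from the current pyproject.toml.
    "nanobind>=0.2.0",
]
```

The `[project]`, `[tool.pytest.ini_options]` and `[tool.cibuildwheel.linux]` tables would stay as they are.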

althonos, Apr 27 '25 11:04

Hi, can you say for what you would want to use the bindings?

The bindings exist mostly because I wanted to be able to quickly experiment with the mapping algorithm. They cover only those parts of strobealign that I needed for these experiments. They are otherwise quite incomplete. At the moment, you can index a reference and get a list of NAMs (matches) for a query, but you cannot run extension alignment or get SAM output.

We also keep refactoring strobealign, and then the bindings need to change as well. At this point, I don’t want to commit to a stable API.

That said, I’m not opposed to publishing the bindings on PyPI; they would just need to be very clearly marked as experimental and incomplete.

By the way, I’m in the process of porting strobealign to Rust; this is done in a separate repo: https://github.com/marcelm/rstrobes. I think I’m 90% done, but one of the things still missing is the Python bindings. I’m not totally certain whether we’ll switch to the Rust version, but if we do, it may not make much sense to spend time on the C++ bindings ...

marcelm, Apr 28 '25 14:04

We are re-implementing the PathSeq pipeline (https://software.broadinstitute.org/pathseq/) in pure Python, and that now gives us the opportunity to try aligners other than BWA for the decontamination and mapping parts, where the original code only supports BWA. This is not something we need right away: for now we are still validating the new code, so we're not gonna switch aligners immediately. But it would be cool to be able to use different aligners and just install them on the fly with pip.

I understand the lack of a stable API, but I think it should be fine to push to PyPI and break between versions: you are on a 0.* version, and with an additional Development Status :: 3 - Alpha classifier it would be clear that you expect things to break. But I also agree that there's no point spending time on this if you are moving to Rust or refactoring a lot. I just meant to say that with just the changes above + a GitHub Actions workflow you could get the bindings released on PyPI without much effort 👍
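
As a rough sketch of the Alpha marking, the `[project]` table could carry the classifier like this. The name, version, license and Python requirement are copied from the pyproject.toml above; only the `classifiers` key is new.

```toml
[project]
name = "strobealign"
version = "0.16.0"
license = "MIT"
requires-python = ">=3.10"
# Trove classifier shown on the PyPI project page; it signals that
# the API is unstable and may break between releases.
classifiers = [
    "Development Status :: 3 - Alpha",
]
```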

althonos, May 05 '25 12:05