fastjet icon indicating copy to clipboard operation
fastjet copied to clipboard

build: use cmake build system in fastjet and fastjet-contrib

Open lgray opened this issue 8 months ago • 16 comments

This branch will be where we figure out pip install . and cibuildwheel for scikit-hep/fastjet using the in-progress cmake build branches for fastjet/siscone/fastjet-contrib.

I need to get the fastjet-contrib edits into a github repository so we can submodule it correctly. Right now I have it working locally from the fastjet-contrib svn repository + patches.

At least on my laptop I can get the scikit-hep/fastjet extension compiling and linking. It doesn't yet function at runtime, but it's only a matter of time.

I'll send some further notifications when it's ready for a more serious look.

One thing we need to solve:

  • [x] MACOSX_DEPLOYMENT_TARGET=11.0 ~(instead of 14.0/13.0 which is what gets us all green right now)~
  • [x] Reduce windows wheel build time (presently 5-7m full build time ~40m-1hr installing dependencies~)
  • [x] Reduce windows wheel build size (presently ~16MB~ 4.3MB / wheel in windows, ~2MB in other OSes)

@matthewfeickert @henryiii

lgray avatar Apr 07 '25 18:04 lgray

OK - the present setup builds on my mac using the following commands:

git clone [email protected]:scikit-hep/fastjet.git -b cmake-build-work --recursive
cd fastjet
mkdir build
cmake -S . -B build -DCMAKE_BUILD_TYPE="Release" -DFASTJET_ENABLE_PYTHON=ON -DFASTJET_ENABLE_CGAL=ON -DFASTJET_PYTHON_PACKAGE_NAME=fastjet_cxx
cmake --build build --clean-first --parallel
cd build
python -c 'import _ext; print(_ext)'

Depending on your circumstances you may need to specify pybind11_DIR or turn off CGAL (I have it in from conda).

I'm gonna leave it at that for now y'all can feel free to poke around on the PR with additions, etc.

lgray avatar Apr 08 '25 01:04 lgray

OK - cool so now it's at least compiling (but not finishing the install!) with scikit-build-core. CGAL is on in all the cases, installed via system libs.

The errors I'm seeing don't make a lot of sense, so would now definitely appreciate help in getting the CI green again :D

lgray avatar Apr 08 '25 03:04 lgray

hmmm it would be nice to be able to browse the packages available in manylinux online.

anyway leaving it here for now. more than decent progress.

lgray avatar Apr 08 '25 04:04 lgray

@henryiii

git clone [email protected]:scikit-hep/fastjet.git -b cmake-build-work --recursive
cd fastjet
pip install . -vv
python -c 'import fastjet; print(fastjet)'

lgray avatar Apr 08 '25 16:04 lgray

I'm also not sure what's up with this metadata error in the CI install. This doesn't happen locally.

lgray avatar Apr 08 '25 17:04 lgray

Nice, only the macos wheels are failing due to what looks like overlinking?

delocate.libsana.DelocationError: Already planning to copy library with same basename as: libgmp.10.dylib

Still, extremely positive and especially given how much this has simplified the installing and wheel building process.

lgray avatar Apr 08 '25 23:04 lgray

@gavinsalam FYI this is what is driving changes in the draft PRs over in fastjet and siscone. Once we get this building wheels in all cases then requirements are satisfied on my side. As you can see it's pretty close!

So long as we're careful to reconcile your requirements with the requirements here we shouldn't run into too much trouble. This will also have a side effect of making the new fastjet/siscone/contrib cmake setup fairly robust; cibuildwheel is picky.

lgray avatar Apr 08 '25 23:04 lgray

Wheels Built

lgray avatar Apr 09 '25 14:04 lgray

Now that that particularly angry yak is shaved. Let's clean this up and ship it. :-) (after the fastjet/siscone/fastjet-contrib releases, etc.)

lgray avatar Apr 09 '25 14:04 lgray

if CGAL is providing too complicated on Windows, it's probably safe to leave it out for now. The switchover to CGAL happens only at relatively high multiplicities. The heuristics are given in ClusterSequence::_best_strategy(), https://gitlab.com/fastjet/fastjet/-/blob/master/src/ClusterSequence.cc?ref_type=heads#L650

Specifically, at large R, the change happens for N=100k (anti-kt) or 40k (kt). For smaller R values the switch to NlnN is at even higher R. Very few users will have that kind of multiplicity, even with full HL-LHC pileup. The C/A algorith has lower thresholds but doesn't need CGAL for its NlnN strategy.

gavinsalam avatar May 08 '25 06:05 gavinsalam

@gavinsalam Thanks for the info! It's not CGAL that's causing problems here (though it does take a long time to install in windows, I have some ideas to mitigate that), it's weirdly strict standards compliance from MSVC + fastjet-contrib not having been completely vetted.

With a little more fiddling it should be working! Only two compilation errors left that I see. The only unfortunate thing is the hour long build time right now, but since I have it working I don't want to now remove CGAL. 😒

lgray avatar May 08 '25 12:05 lgray

Cool - windows tests pass.

Now to shoehorn it into cibuildwheel.

lgray avatar May 09 '25 20:05 lgray

@matthewfeickert @henryiii I've got the windows wheel matrix fully up and running. I've added two tasks that I would appreciate help on now that we've got working recipes. Any help you can provide is deeply appreciated in these directions, it'll vastly help with maintainability.

lgray avatar May 10 '25 00:05 lgray

A quick check of turning off debug info in the windows wheels brought it down to 4.5MB on my home desktop. That's reasonable already. I'll put it in the full wheel matrix if everything works out in the short-form CI.

lgray avatar May 10 '25 00:05 lgray

looks like things are failing due to gitlab getting angry. I'll start 'em again tomorrow.

lgray avatar May 10 '25 02:05 lgray

Not too bad, down to ~7 minutes for the windows build now.

Will propagate it later to the wheels.

lgray avatar May 14 '25 22:05 lgray

Nice! The dll build allows us to get to 2.4MB wheels on windows, which is commensurate with the other OS's wheel sizes. I think it can be made almost exactly the same as the other OS's when we're using the dll version of siscone.

I think this is now very well under control.

lgray avatar Jun 04 '25 21:06 lgray

OK - there we go now we ship the complete fastjet-contrib uni-library. All plugins except for pxcone are enabled. It's more or less fully featured and ~future proof.

lgray avatar Jun 05 '25 15:06 lgray

OK - this latest change is using a top-level only CMake setup for fastjet-contrib. Somehow this has broken sdist, @matthewfeickert @henryiii a look would be appreciated.

I should be able to converge on a nice patchset for fastjet-contrib at this point.

The unfortunate thing is getting all the variants working does require code changes in fastjet-contrib, but they're relatively minimal.

lgray avatar Jun 06 '25 16:06 lgray

@gavinsalam You may find the top-level-only cmake implementation for fastjet-contrib here:

  • https://github.com/lgray/fastjet-contrib/pull/1

Right now it includes the necessary patches for windows. It's not actually required to apply them for unix-likes.

lgray avatar Jun 06 '25 18:06 lgray

Unsurprising, but there are no numpy windows arm64 wheels yet. Fastjet itself actually finished compiling though, we'll be ready when it's available.

lgray avatar Jun 27 '25 17:06 lgray

OK - numpy does build win-arm64 wheels but only for py310 and greater. After that what breaks in the tests is some stuff that's missing the win-arm64 rust toolchain. Rather close to being there, should check again in a couple months.

lgray avatar Jun 27 '25 21:06 lgray

You can add the Rust toolchain (see https://github.com/actions/partner-runner-images/issues/77), and I think it's coming soon anyway. I'd skip older Python's on Win ARM, there really aren't legacy users there anyway.

henryiii avatar Jun 28 '25 03:06 henryiii

Thanks @henryiii, let's see if it blends.

lgray avatar Jun 28 '25 13:06 lgray

Yikes, now it dies on building pyarrow.

lgray avatar Jun 28 '25 14:06 lgray

What I've decided to do in the mean time is just skip tests on windows-11-arm64. So much of the infrastructure is missing or won't build at the moment, and the library itself builds fine and tests fine in x64 windows and arm64 linux/macos. So we're very likely safe.

lgray avatar Jun 28 '25 15:06 lgray

This is now waiting for the release of fastjet-3.5.1. I'm going to flip it out of draft mode.

lgray avatar Jun 30 '25 21:06 lgray

OK - this is now using fastjet-3.5.1

lgray avatar Jul 01 '25 13:07 lgray

@gavinsalam I shall require your blessing to merge this. :-)

I think the only point of concern is using the cmake patchset atop fastjet-contrib, but it is localized entirely to the build step of this project. It's not presenting it to users for wider consumption, so it's much less of a risk.

lgray avatar Jul 01 '25 13:07 lgray

@gavinsalam I shall require your blessing to merge this. :-)

I think the only point of concern is using the cmake patchset atop fastjet-contrib, but it is localized entirely to the build step of this project. It's not presenting it to users for wider consumption, so it's much less of a risk.

This is blessing with regard to the Windows wheels, or specifically the fjcontrib changes? Still haven't had a chance to go through those (what bandwidth there was on our end went to 3.5.1)

gavinsalam avatar Jul 02 '25 13:07 gavinsalam