PyBaMM
[WIP] Download IDAKLU from pybammsolvers
Description
This will separate the IDAKLU C++ code from pybamm.
Type of change
This should speed up CI by skipping the build of the C++ code.
- [x] Optimization (back-end change that speeds up the code)
Key checklist:
- [x] No style issues: $ pre-commit run (or $ nox -s pre-commit) (see CONTRIBUTING.md for how to set this up to run automatically when committing locally, in just two lines of code)
- [x] All tests pass: $ python run-tests.py --all (or $ nox -s tests)
- [x] The documentation builds: $ python run-tests.py --doctest (or $ nox -s doctests)
You can run integration tests, unit tests, and doctests together at once, using $ python run-tests.py --quick (or $ nox -s quick).
Further checks:
- [x] Code is commented, particularly in hard-to-understand areas
- [x] Tests added that prove fix is effective or that feature works
A new link error cropped up, but it looks like we could get a lot of savings on time with this update.
Edit: Most of the run time appears to be in the integration tests, so unfortunately the time savings are not as good as I would have hoped.
The linkage error is the same one as #3783, coming from CasADi's plugin system. I am not sure if it's worth fixing, since @martinjrobins fixed it for the linear interpolant case by dropping down to Python, but IIRC there wasn't a way in CasADi to do the same for the cubic case.
@agriyakhetarpal Yeah, I was looking at that issue as well. As far as I can tell, CasADi sets a path for plugins. I am trying to see if there is a decent workaround, since this was part of #4464.
My guess is that the wheels for the next release will be broken as well, but I have not confirmed it yet.
There is a workaround for Linux and macOS, but not for Windows (different toolchain); sadly, it's not decent enough to include. I think I'll raise a PR upstream in CasADi to get one part of the linkage going and see if we can migrate to a non-MSVC toolchain (which can potentially help provide that workaround for this on Windows later on). It's been on my list of things to do for a while, but I've yet to do it.
This is fixed locally with this: export CASADIPATH=.venv/lib/python3.12/site-packages/casadi
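For anyone who wants to try the same local workaround without hard-coding the virtualenv path, here is a minimal sketch. It assumes, as the comment above suggests, that setting CASADIPATH to the installed casadi package directory is what resolves the plugin lookup; the helper names `package_dir` and `casadi_env` are hypothetical and not part of PyBaMM or CasADi.

```python
import importlib.util
import os
from pathlib import Path


def package_dir(name: str) -> Path:
    """Return the install directory of a package without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"{name!r} is not installed in this environment")
    # For a regular package, origin points at <package>/__init__.py
    return Path(spec.origin).parent


def casadi_env(package: str = "casadi") -> dict:
    """Copy of os.environ with CASADIPATH set to the package directory.

    Assumption: CASADIPATH is the variable used for the plugin search path,
    as in the export command quoted in this thread.
    """
    env = dict(os.environ)
    env["CASADIPATH"] = str(package_dir(package))
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when launching the test suite, which keeps the change scoped to that one process. As noted in this thread, this does not help on Windows.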
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 98.66%. Comparing base (a7253b8) to head (c9a75e2). Report is 135 commits behind head on develop.
Additional details and impacted files
@@ Coverage Diff @@
## develop #4487 +/- ##
===========================================
- Coverage 99.22% 98.66% -0.56%
===========================================
Files 303 303
Lines 23070 23224 +154
===========================================
+ Hits 22891 22914 +23
- Misses 179 310 +131
This is fixed locally with this:
export CASADIPATH=.venv/lib/python3.12/site-packages/casadi
Yes, won't work with Windows
I had a look at this. The linker error is the same one I came across for the case of linear interpolation. The solution there was to call the CasADi function directly rather than go through their plugin system; the plugin system won't work when the CasADi function is evaluated in C++ on Windows, because we compile everything statically.
I think I might be able to access the direct bspline interface by calculating the spline coefficients in SciPy and then using the CasADi Function.bspline function to construct a bspline. Fingers crossed this doesn't use the plugin system anywhere! Going to try this out in https://github.com/pybamm-team/PyBaMM/issues/4570
Yeah, I was going to approach this by seeing if I could just change the build itself. It is something that should work if we are compiling and delivering everything correctly. If that does not work, then I will look at workarounds for interpolation.
I expect to work on this again next week; I have been caught up with other stuff.
It looks like there are still issues with the IDAKLU JAX solver on Windows. I can look into these?
@martinjrobins Sure if you want to look at it you are more than welcome. I am hopefully going to be able to take another look this evening
I recently got a Windows laptop so I could start looking into this stuff locally. Most of my commits to this branch recently have been me testing things for the release as I have been focused on getting that out the door
I tried to figure this one out today but no luck :( It's crashing with a fatal exception when JAX tries to JIT compile, and I'm still in the dark as to why. It might be a threading issue, as the problem is intermittent (it occurs in about 95% of test runs). It might be triggered by some interaction with pytest, because when I copy the test into a stand-alone script it works fine.
For a stopgap solution, we can isolate these tests into their own xdist_group and allow only one worker to touch them at a time.
Yeah that is my fallback option.
I want to take a closer look at the linking/delivery as well. We have failures on Windows when you download the wheels:
- pybammsolvers has some crashed workers
- My i5 (without AVX-512 instructions) has ~45 test failures on both 24.9.0 and 24.11.0 when running tests with the wheels
- A colleague's i7 (with AVX-512 instructions) has ~25 test failures on both 24.9.0 and 24.11.0 when running tests with the wheels
So it appears that the tests work when you run them in the build environment, but not in a different environment. I will dig into this more and see what I come up with.
Hi - I took a very quick look at this yesterday and agree that it seems to be a threading issue. More specifically, jaxify() can only be called once per solver instance (this is the first test), which then caches the full solve result so the JAX wrapper can query samples without repeatedly re-running the solver. My suspicion is that running these tests in parallel is causing test pollution, probably because the test script currently instantiates the solver and JAX wrapper objects at the start of the test script, not as a fixture for each test (although in that case the Ubuntu tests should also be failing?). Refactoring the tests with fixtures would be a good way to see if that resolves things. I can take a look at that if you like, but if I'm right then the xdist_group solution should also work if you need a quick fix. I can't remember the precise details as to why we can't jaxify more than once per object, but I do remember that it was more complex than just the cache issue (something to do with the JAX primitives...).
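To make the fixture suggestion concrete, here is a rough sketch of the difference between sharing one object across tests and building a fresh one per test. `OneShotSolver` is a hypothetical stand-in, not the real IDAKLU JAX wrapper; the only behaviour it borrows from the description above is that jaxify() may be called once per instance, and raising on the second call is an assumption made for illustration.

```python
class OneShotSolver:
    """Hypothetical stand-in: jaxify() is allowed once per instance."""

    def __init__(self):
        self._jaxified = False

    def jaxify(self):
        if self._jaxified:
            # Assumed failure mode, chosen only to make the problem visible
            raise RuntimeError("jaxify() already called on this instance")
        self._jaxified = True
        return lambda t: 2.0 * t  # placeholder for the wrapped solve


def make_solver():
    """Plays the role of a per-test fixture: every caller gets a new object."""
    return OneShotSolver()


def check_first():
    f = make_solver().jaxify()
    return f(1.0)


def check_second():
    # Independent of check_first because it never touches a shared instance
    f = make_solver().jaxify()
    return f(2.0)
```

With a module-level instance, the second check would depend on whether the first had already run, which is the kind of ordering sensitivity that parallel workers tend to expose. In pytest the factory would normally be written as a function-scoped fixture.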
Just a small note to say that it is not just a matter of running the tests in serial to make them pass. I had to turn off both the pytest workers and the faulthandler (pytest -n 0 -p no:faulthandler) before the tests would pass. With these options, all the tests in test_idaklu_jax.py pass reliably.
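If it helps to reproduce this from a script, the invocation can be wrapped as below. This is only a sketch: the function names are hypothetical, the `-n` option assumes pytest-xdist is installed, and the path to test_idaklu_jax.py has to be supplied by the caller since its location in the repository is not given here.

```python
import subprocess
import sys


def serial_pytest_command(test_path: str) -> list:
    """Build the command that disables xdist workers and the faulthandler plugin."""
    return [
        sys.executable, "-m", "pytest",
        "-n", "0",                 # no xdist workers (needs pytest-xdist)
        "-p", "no:faulthandler",   # do not load the faulthandler plugin
        test_path,
    ]


def run_serial(test_path: str) -> int:
    """Run the tests in the current interpreter's environment; return the exit code."""
    return subprocess.run(serial_pytest_command(test_path), check=False).returncode
```

Using `sys.executable -m pytest` rather than a bare `pytest` keeps the run inside the same environment as the calling script, which matters here because the failures differ between environments.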
Testing out the skips and test refactor now with CI. I have some docs stuff to update then this should be mostly ready to go. I will do one more update to pybammsolvers to make sure that the versions of IDAKLU source files match
Tests pass, just need to do some documentation fixes
Additional documentation will be added to the pybammsolvers repo
@MarcBerliner, @martinjrobins Ok I think this is finally working. I am working on tests and docs for the other repo now
@agriyakhetarpal I know we have not solved the ARM64 or conda-forge issues yet, but how do you feel about getting this merged ASAP to see if we start getting issues reported?
Note: The changes from #4736 are not in pybammsolvers yet. The tests pass without the C++ side for now though. I am working on the pybammsolvers v0.0.5 release, but I have to do a bit of testing before it is ready. Hopefully I will finish that off today