incompatibility with newer Numba (threads; currently errors by default on Google Colab!)

Open thearia0 opened this issue 3 months ago • 1 comments

It doesn't happen on every run, but when the code is run on Google Colab, Numba often reports an error about not having enough threads:

Image

it may be connected to this PR: https://github.com/open-atmos/PyMPDATA/commit/0a2e883dd1842c1879879d7bd3a15e26c8b2ad63

thearia0 avatar Nov 20 '25 18:11 thearia0

This is a known incompatibility of newer Numba versions with PyMPDATA. Essentially, Numba has changed its API, and we haven't yet found a way to adapt to the new one. Passing n_threads=1 to the Stepper constructor is a workaround (but it also disables multi-threading entirely).

slayoo avatar Nov 20 '25 18:11 slayoo

@Sfonxu, I'm looking around and cannot find any record of the Numba version in which the incompatibility first appeared. Do you have any notes from that investigation?

slayoo avatar Nov 22 '25 10:11 slayoo

I don't really have anything written down, but judging by some archived info I have from PyMPDATA-MPI development, it would be somewhere around Numba >= 0.57.0. Only earlier Numba versions were in use back then, and only after updating to Python 3.12 (which meant newer Numba versions) did a lot of those threading issues start to become apparent.

EDIT: typos and more precise description

Sfonxu avatar Nov 22 '25 18:11 Sfonxu

but on CI we're using newer releases, and our sanity test passes, and there are no timeouts... https://github.com/open-atmos/PyMPDATA/blob/ed8d8c1c7dc6f0d921cb6253dd4e2d4be3be4e2a/setup.py#L34

slayoo avatar Nov 22 '25 20:11 slayoo

however, for Linux, we also only test with OpenMP! https://github.com/open-atmos/PyMPDATA/blob/ed8d8c1c7dc6f0d921cb6253dd4e2d4be3be4e2a/.github/workflows/tests.yml#L137-L138

slayoo avatar Nov 22 '25 20:11 slayoo

while Colab's default is Intel TBB:

Image

slayoo avatar Nov 22 '25 21:11 slayoo

and Numba docs quite explicitly mention that TBB might use fewer threads than other backends (https://numba.pydata.org/numba-doc/dev/developer/threading_implementation.html#caveats):

Image

slayoo avatar Nov 22 '25 21:11 slayoo

so it seems that we should "simply" require that anyone using PyMPDATA picks omp as the Numba threading layer
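To see which layer a given session actually ended up with, a small diagnostic along these lines might help. This is only a sketch: the function name `active_threading_layer` is made up for illustration, and it assumes Numba is installed (it returns None otherwise). It relies on `numba.threading_layer()`, which reports the chosen layer only after a parallel function has been run.

```python
def active_threading_layer():
    """Return the name of the threading layer Numba loaded (hypothetical helper).

    Returns None if Numba is not installed, or an "error: ..." string if the
    parallel launch failed (for instance, if the requested layer is unavailable).
    """
    try:
        import numba
    except ImportError:
        return None  # Numba is not available in this environment

    @numba.njit(parallel=True)
    def _touch(n):
        # trivial parallel reduction, just to force a parallel launch
        total = 0
        for i in numba.prange(n):
            total += i
        return total

    try:
        _touch(16)
        # threading_layer() is only meaningful after a parallel function has run
        return numba.threading_layer()
    except Exception as error:
        return f"error: {error}"


print(active_threading_layer())
```

On a default Colab session one would expect this to report tbb, and omp once the OpenMP layer has been requested, but that is worth confirming rather than assuming.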

slayoo avatar Nov 22 '25 21:11 slayoo

well, the warning message that we already print actually does suggest it!

Image

slayoo avatar Nov 22 '25 21:11 slayoo

> but on CI we're using newer releases, and our sanity test passes, and there are no timeouts...
>
> PyMPDATA/setup.py, line 34 in ed8d8c1:
> 13: "==0.61.2",

That omp line in the tests did come up during the investigation, if I'm not mistaken. On my Linux laptop, I always have to switch the threading backend to OpenMP before I can run PyMPDATA tests or anything else.

It is possible to change the backend on Colab via adding this before any Numba-dependent code is run:

import numba
numba.config.THREADING_LAYER="omp"

And restarting the session (for instance via the menu under the run all button). This worked for me when testing this PyMPDATA example on Colab. I also noticed that the first run of a session sometimes worked even without changing the backend, but I can't really confirm why.

Sfonxu avatar Nov 23 '25 16:11 Sfonxu

One other note is that in the Numba threading docs we read:

The default manner in which Numba searches for and loads a threading layer is tolerant of missing libraries, incompatible runtimes etc.

I wonder if that might also have something to do with this...

EDIT: source

Sfonxu avatar Nov 23 '25 16:11 Sfonxu

> It is possible to change the backend on Colab via adding this before any Numba-dependent code is run:
>
> import numba
> numba.config.THREADING_LAYER="omp"

We should then add it to all notebooks in PySDM and PyMPDATA and change the devops_tests requirement for the header cell accordingly. Perhaps setting an environment variable would be simpler (i.e., faster)?

> And restarting the session (for instance via the menu under the run all button).

If this is done at the top of the notebook, a restart is not needed (as it would not change anything).
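A rough sketch of the environment-variable variant, for a notebook header cell, could look like the following. The helper name `request_threading_layer` is hypothetical; the only Numba-specific piece is the NUMBA_THREADING_LAYER variable, which Numba reads when it is imported. The helper therefore also reports whether Numba was already imported, which is the case where setting the variable would come too late.

```python
import os
import sys


def request_threading_layer(layer="omp"):
    """Request a Numba threading layer through the environment (hypothetical helper).

    Returns True if Numba has not been imported yet, so the variable will be
    picked up on import; False if Numba was already imported in this process.
    """
    numba_already_imported = "numba" in sys.modules
    os.environ["NUMBA_THREADING_LAYER"] = layer
    return not numba_already_imported


if request_threading_layer("omp"):
    print("NUMBA_THREADING_LAYER set before importing Numba")
else:
    print("Numba was already imported; the variable alone may not take effect")
```

Placed at the very top of a notebook, before any import that pulls in Numba, this should make a session restart unnecessary.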

slayoo avatar Nov 23 '25 17:11 slayoo

a check for setting NUMBA_THREADING_LAYER in all notebooks is being worked on here: https://github.com/open-atmos/devops_tests/pull/52

Thanks @AgnieszkaZaba !
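For illustration only, a check of this kind could be sketched as below. This is not the implementation from the devops_tests pull request; the function name and the rule it applies (the first code cell must mention the variable) are assumptions made for the example. It uses nothing beyond the notebook's JSON structure.

```python
import json


def header_sets_threading_layer(notebook_json, variable="NUMBA_THREADING_LAYER"):
    """Return True if the first code cell of a notebook mentions `variable`.

    Hypothetical check: `notebook_json` is the text of an .ipynb file.
    """
    notebook = json.loads(notebook_json)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            if isinstance(source, list):
                # cell sources may be stored as a list of lines
                source = "".join(source)
            return variable in source  # only the first code cell counts
    return False  # no code cells at all


example = json.dumps(
    {
        "cells": [
            {"cell_type": "markdown", "source": ["# Example notebook"]},
            {
                "cell_type": "code",
                "source": ["import os\n", "os.environ['NUMBA_THREADING_LAYER'] = 'omp'\n"],
            },
        ]
    }
)
print(header_sets_threading_layer(example))
```

A substring match is deliberately crude; a stricter check might also verify the value being assigned.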

slayoo avatar Nov 24 '25 20:11 slayoo