Numpy 2.0
Description
Ruff suggested a few things to change, but I cannot test them because I cannot install both numpy=2.0.0rc1 and numba (any version, including the most recent 0.59.1) due to Python ABI incompatibilities.
Related Issue
- [x] Closes #688
- [ ] Related to #
Checklist
- [x] Checked that the pre-commit linting/style checks pass
- [ ] Included tests that prove the fix is effective or that the new feature works
- [ ] Added necessary documentation (docstrings and/or example notebooks)
- [x] If you are a pro: each commit corresponds to a relevant logical change
Type of change
- [ ] New feature / enhancement
- [ ] Bug fix
- [ ] Documentation
- [x] Maintenance
- [ ] Other (please specify):
Removed the custom Complex C types... I have no context as to why they were necessary, so that definitely needs careful review.
This is code that seems to have been there mostly from the beginning.
Edit: it seems to be needed to define operations like `*` :(
In what context? If it is in array indexing, then this syntax was only added in Python 3.11 (I only found out about this recently).
No, this is C(++) code, multiplication of complex scalars.
Maybe @michaelosthege finds this interesting xD?
These are the breaking changes: https://numpy.org/devdocs/numpy_2_0_migration_guide.html#complex-types-underlying-type-changes
Friendly note that it will be necessary to revert 1235d08b1c75aa1458beb9f978d7ba3dfe3424db on the next rebase in order to unpin numpy<2.
Now with numpy 2.0 released, it would be nice to be compatible with it.
There's some esoteric C++ code for complex variables that I didn't want to dig too much into.
Is this PR stalled? The numpy release notes seem to suggest what changes need to be made (replace x.real and x.imag with calls to some new functions in the numpy C-API). I can try to implement those changes.
There's something a bit more complex: we are using C++ to overload operators on "scalar" complex numbers. This goes beyond the standard use of numpy, and their advice is not sufficient here.
Okay, it's possible I'm misunderstanding. I thought that code is defining a struct (e.g. pytensor_complex64) that inherits from npy_complex64 (an alias for npy_cfloat), etc., and it looks like the numpy migration notes say that for npy_cdouble you should no longer access the real and imaginary parts as fields, but instead use the functions npy_creal and npy_cimag. So all uses of .real etc. in the overloaded operators (and possibly some other methods) would need to be changed.
It does seem a bit annoying in that npy_creal needs to be replaced with npy_creall for pytensor_complex128
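Concretely, the change inside the generated C code looks something like this (a hedged sketch, written as the Python strings pytensor's C backend emits; not the actual source):

```python
# Hedged sketch of the before/after C fragments (illustrative only).
# NumPy 1.x: npy_cdouble is a struct, so the members can be read directly.
c_code_numpy1 = """
double re = z.real;
double im = z.imag;
z.real = 1.0;
"""

# NumPy 2.0: npy_cdouble is C99 `complex double`, so the new accessor
# functions must be used instead. The npy_cfloat / npy_clongdouble
# variants carry f / l suffixes (npy_crealf, npy_creall, ...).
c_code_numpy2 = """
double re = npy_creal(z);
double im = npy_cimag(z);
npy_csetreal(&z, 1.0);  /* setter replaces direct member assignment */
"""
```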
I'm not certain I can fix it, but I'd be interested in trying. (Although if the above is completely off base, please let me know... in that case, I certainly don't understand the problem.)
@brendan-m-murphy I think you're on the right track. The part where we overload operators was where I stopped. The compiler wasn't happy with my changes to use the new real/imaginary accessors here: https://github.com/pymc-devs/pytensor/blob/main/pytensor/scalar/basic.py#L527-L538
Feel free to take a stab at it. You can push to this branch, or, if you don't have permissions, start a new PR, either on top of my branch or directly against main.
You'll have to revert "Remove custom Complex type"; that commit was just to investigate which parts of the codebase actually depended on it.
Maybe this table is helpful?
Also, there is this note in the NumPy 2.0.0 release notes and this one on NumPy 1 & 2 compatibility. Namely, NumPy 2 just uses C99's complex types.
Yes, I think using C99 types is the reason for moving to npy_creal: the function for getting the real part of a complex double in C99 is creal, etc.
Just an update: I've updated the code for ScalarType, which reduced the number of failing tests. (This is the PR on my fork: https://github.com/pymc-devs/pytensor/pull/905. It includes all the commits on this PR up to the commit that removed the complex variable code.)
There are still some failing tests related to the change to complex numbers in Numpy in AdvancedIncSubtensor1, around lines 2246-2289 in tensor/subtensor.py. I'm not sure what this part of the code is doing yet, although the changes for numpy 2.0 compatibility should still just be a matter of replacing .real etc. with the right npy_creal variant.
An issue I had with the ScalarType code, which is also going to be a problem for the AdvancedIncSubTensor1 class, is that the numpy C types npy_complex64 and npy_complex128 are aliases of types like npy_cfloat, npy_cdouble, and npy_clongdouble, and the functions to get/set the real and imaginary parts are defined in terms of float, double, and long double. So we need to "reverse" the typedefs to figure out whether npy_complex64 is npy_cfloat or npy_cdouble, etc.
(These typedefs are in numpy/_core/include/numpy/npy_common.h.)
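To illustrate the kind of mapping needed when generating the C code (a hedged sketch: the helper is made up, and the suffixes assume the usual platform typedefs where complex64 is npy_cfloat and complex128 is npy_cdouble):

```python
# Suffixes for the numpy 2.0 accessors: npy_crealf / npy_creal / npy_creall
# (and likewise for npy_cimag*, npy_csetreal*, npy_csetimag*).
_NPY_COMPLEX_SUFFIX = {
    "complex64": "f",   # assumes npy_complex64 is npy_cfloat
    "complex128": "",   # assumes npy_complex128 is npy_cdouble
}

def real_part_c_expr(dtype_name: str, var: str) -> str:
    """Return a C expression for the real part of the complex variable `var`."""
    return f"npy_creal{_NPY_COMPLEX_SUFFIX[dtype_name]}({var})"

print(real_part_c_expr("complex64", "z"))   # npy_crealf(z)
print(real_part_c_expr("complex128", "z"))  # npy_creal(z)
```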
The AdvancedIncSubtensor1 code also defines a case for npy_complex32, which, as far as I can tell, doesn't exist, so I guess we should remove that.
The updates I made will work for 32- and 64-bit machines, assuming that long double on the 32-bit machine is actually 64-bit. To cover more configurations, I suppose we could copy numpy's setup of defining everything in terms of float, double, and long double, and then having platform-specific aliases for the types named by number of bits.
I made some small changes to handle the typedefs for npy_cdouble etc. a bit more smoothly. I think those changes are pretty much ready to merge into this branch. There are a lot of mypy issues, but I suspect they're not related to my changes, which were mostly to the C code. I'll try to tidy it up and then request a review.
I think there are serious problems with the tensor/subtensor.py module. The "map iter" part of the Numpy C-API is no longer public, so I think the C code in AdvancedIncSubtensor1 will need to change substantially. I can ask on the numpy mailing list what they would suggest. (The PR that removes it is https://github.com/numpy/numpy/pull/24274, which is part of the Numpy 2.0 release.)
There are some suggestions on that PR and this Aesara issue: https://github.com/aesara-devs/aesara/issues/1506
I could try to work on this, but I suspect it would take me longer than the complex type updates.
Thanks @brendan-m-murphy. For context, AdvancedIncSubtensor1 is a subset of Numpy advanced indexing that sets or increments a vector of values (possibly broadcasted) at a vector of integer indices along the first dimension of an array (and nothing else).
It is the only form of AdvancedIndexing implemented in C. Looking at the C implementation, it seems like quite a mess, and possibly overly complicated/inefficient. Like why is the set_or_inc done in the innermost function, and in a way that the compiler will never know which case it is?
Anyway, since we are trying to part ways with the C backend, I think we can make this operator's C implementation conditional on the installed numpy version. If we raise NotImplementedError from def c_code(...) for numpy >= 2.0 (see the sketch below), pytensor will just default to the Python implementation instead.
Otherwise, if someone feels like doing a C implementation, help is welcome. I think the only issue is replacing the MapIter logic.
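A minimal sketch of the version-conditional fallback (the version check and import path are assumptions, not the final implementation):

```python
import numpy as np
from pytensor.graph.op import Op  # assumed import path

class AdvancedIncSubtensor1(Op):
    def c_code(self, node, name, inputs, outputs, sub):
        if np.lib.NumpyVersion(np.__version__) >= "2.0.0":
            # MapIter is no longer public in the numpy 2.0 C-API;
            # raising here makes pytensor build a thunk around the
            # Python perform() implementation instead.
            raise NotImplementedError("C code requires numpy < 2.0")
        ...  # existing numpy 1.x C implementation
```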
@brendan-m-murphy regarding complex32, I also don't recall it being a thing these days. Just drop it
Last thing @brendan-m-murphy if possible, it would be great if the complex stuff worked both in numpy 1.x and 2.x. Maybe it does, just stating in case it was never brought up.
It doesn't currently, but I think it's an easy change (I'm pretty sure the numpy release notes say how). I suppose I could make a PR against main then (I'd probably do a new branch for that with just the complex variable changes). Also, I could make the updates for complex variables to subtensor.py, so if the MapIter code gets fixed, then there's nothing left to patch up there.
We can tweak the test CI to run with both versions of numpy
So if the C code isn't available, the PerformLinker would be used for an Apply node, even if the mode says to use the CLinker?
This (rather old) commit seems to already implement a version of the fancy indexing increment using numpy for the "perform" function: https://github.com/pymc-devs/pytensor/commit/c60bd3b29d67c4c49c2bfc65464329fc44f829c7 (all the C code is missing there, but that was moved into the class a few years ago).
The linker doesn't change, but the default already handles mixing C and Python implementations. When an Op raises NotImplementedError for C code it builds a thunk around the python perform method instead.
Oh okay, so raising the NotImplementedError would use np.add.at, which seems to be the suggested replacement for using MapIter: https://github.com/aesara-devs/aesara/issues/1506#issue-1824485598. (This is exactly what you said, but it took me a while to figure out what code might be used with the Python linker, and I didn't read far enough to see that the CLinker would default to that.)
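For reference, a minimal illustration of why np.add.at (rather than a plain fancy-indexed +=) matches the increment semantics:

```python
import numpy as np

x = np.zeros((5, 3))
idx = np.array([0, 1, 1, 4])   # note the repeated index 1
vals = np.ones((4, 3))

np.add.at(x, idx, vals)        # unbuffered: row 1 is incremented twice
# x[idx] += vals would apply the repeated index only once.
```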
I guess there could be a performance hit, but it sounds like the numpy team are working on improving np.add.at, so if we do have a slow down, maybe it is something they'll fix.
I'll try to get my PR in soon. I had to move to other things at work, but should be able to finish it in an evening or two.
The slowdown seems worthwhile, especially if np.add.at gets faster (and we make sure to use it in the Python perform method). This code is pretty hard to maintain, especially with numpy deprecating it. I'm a bit puzzled that the numpy C-API has nothing very useful for iterating over fancy indexes.
Just an update: raising a not implemented error for the advanced inc subtensor when using numpy 2.0 removes the errors around the MapIter (as expected), but it revealed more errors related to the complex arithmetic. I haven't had a lot of time to look into these new errors, but I'll try to find out more this week.
Thanks 🙏 Hopefully we're getting there!
Some of the tests in tests/tensor/random/rewriting are failing with: TypeError: Scalar(float32, shape=()) cannot store accurately value 1e-06, it would be represented as 9.999999974752427e-07. If you do not mind this precision loss, you can: 1) explicitly convert your data to a numpy array of dtype float32, or 2) set "allow_input_downcast=True" when calling "function".
This is for test_Subtensor_lift with param tuples 17, 18, and a few more.
The error comes from trying to create a TensorConstant from 1e-6. Ultimately, the check np.all(np.array(1e-6) == np.array(1e-6).astype('float32')) fails, leading to this error. If you do float(np.array(1e-6).astype('float32')) or np.array(1e-6).astype('float32').astype('float64'), you get 9.999999974752427e-07.
Locally (on a macbook m2), this check fails with numpy 2.0 and numpy 1.26. I guess I don't see why this error shouldn't occur, since to get to this point in TensorType.filter (around line 214), we know that the data we're trying to convert is 1) a float, 2) not the same type as config.floatX, and 3) for this test, not a value that can be represented exactly in binary floating point. (The same problem doesn't happen if you replace 1e-6 by 2**-6.)
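For concreteness, the failing round trip can be reproduced directly (a sketch of the comparison, not the actual TensorType.filter code):

```python
import numpy as np

data = np.array(1e-6)
converted = data.astype("float32")
print(np.all(data == converted))  # False: 1e-6 is not exact in float32
print(float(converted))           # 9.999999974752427e-07

# A dyadic rational survives the float64 -> float32 round trip:
print(np.all(np.array(2.0**-6) == np.array(2.0**-6).astype("float32")))  # True
```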
I haven't tested that the updates to complex numbers work for numpy 1.26 yet, but with numpy 2.0, basic tensor operations work with complex types now. The tests with the latest updates are here: https://github.com/brendan-m-murphy/pytensor/actions/runs/10144114724/job/28047006005?pr=2 I'll update the workflow to run against numpy 1.26 as well.
That failure doesn't seem too worrisome; I guess it has to do with a change in numpy's casting behavior.
The float32 tests are always a bit of a pain around here. Feel free to cast explicitly to make the test pass, or tweak the test parameters so we're not testing such values. This is completely unrelated to the main goal of those tests.
I'm still working through the failing tests. For test_inner_composite in tests/scalar/test_loop.py, there is a composite function that should always evaluate to 1, but when a float16 is used for the variable in the composite, there is an overflow. It seems like this test was designed to check that this is avoided, since the overflow doesn't happen for a smaller n_steps, and n_steps is deliberately increased in this part of the test. This doesn't fail on numpy 1.26. Maybe this has to do with the upcasting code?
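(For reference on how quickly float16 overflows under accumulation; this is just the arithmetic, not the test itself:)

```python
import numpy as np

# float16 max is ~65504, so repeated squaring overflows after four steps:
x = np.float16(2.0)
for _ in range(5):
    x = x * x                    # 4, 16, 256, 65536 -> inf, inf
print(x)                         # inf (numpy emits a RuntimeWarning)
```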
For TestScan.test_inner_graph_cloning in tests/scan/test_basic.py, the test fails on numpy 2.0 as is, but if Mode(optimizer=None) is changed to Mode(linker="cvm") or Mode(linker="py"), then the test passes on numpy 2.0. It passes as is on numpy 1.26. (Actually, just removing the mode argument allows it to pass, but I guess this selects cvm by default?) For now, I've removed the mode argument, but maybe it is worth investigating further.
For the random variable tests, I had to replace copy(rng) with deepcopy(rng). I'm not really sure why this needed to change; Generator isn't mentioned in the numpy 2.0 release notes (as far as I can tell).
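The change in the tests, for reference (purely illustrative; whether copy() actually behaves differently under numpy 2.0 is unconfirmed):

```python
from copy import copy, deepcopy

import numpy as np

rng = np.random.default_rng(42)
rng_shallow = copy(rng)      # what the tests used before
rng_deep = deepcopy(rng)     # what the tests use now
```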
In tests/tensor/basic.py, test_autocast_custom is failing because pt.as_tensor_variable(1.1) returns a float32 (here, using the autocast_float_as("float32", "float64") context manager makes no difference). We have (pt.fvector() + 1.1).dtype == 'float32' in either case. It seems like the problem(?) is that 1.1 == np.asarray(1.1).astype('float32') is true in numpy 2.0, but not in numpy 1.26. (In numpy 2.0, it evaluates to np.True_, and in numpy 1.26 it evaluates to False.)
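The comparison in question, as reported above (I'd want this verified on both numpy versions):

```python
import numpy as np
import pytensor.tensor as pt

print(1.1 == np.asarray(1.1).astype("float32"))  # np.True_ on 2.0, False on 1.26
print((pt.fvector() + 1.1).dtype)                # 'float32' in either case
```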