RFC: wrap any array API standard compatible array
Preface: I'm not familiar with the history behind https://pint.readthedocs.io/en/stable/user/numpy.html#Array-Type-Support, but I have experience adding support for alternative array types to SciPy. For context, see https://numpy.org/neps/nep-0056-array-api-main-namespace.html#abstract, which superseded NEP 30.
The array API standard is being utilised in libraries like SciPy and scikit-learn to add support for array libraries like CuPy, PyTorch, and JAX. It seems feasible that Pint could wrap any library which is compatible with the standard.
The model for how this can be done is demonstrated in https://github.com/mdhaber/marray. Marray adds masked array support to standard-compatible arrays. Similar to Pint's Quantity, an MArray is some data together with a mask:
MArray(
Array([1, 2], dtype=array_api_strict.int64),
Array([False, True], dtype=array_api_strict.bool)
)
Marray exposes one function, get_namespace, which takes as input a standard-compatible namespace, and returns a new standard-compatible namespace with the additional capabilities of creating and correctly processing masked arrays. Under the hood, this namespace wraps each function to handle anything specific to masks, and otherwise dispatches to the underlying array namespace. So for pint, pint.get_namespace(torch) would return a namespace for pint-wrapped PyTorch tensors.
MArrays also expose this namespace via the __array_namespace__ method, which allows libraries to operate on them like any other standard array:
# get the namespace for input pint(torch) array x
xp = x.__array_namespace__()
y = xp.exp(x) # returns another pint(torch) array
z = xp.asarray([1, 2, 3]) # also a pint(torch) array
...
Marray currently passes almost the entirety of https://github.com/data-apis/array-api-tests. It would be cool to see if Pint could do the same!
This would bring a few benefits:
- for users, they can use pint with whatever standard-compatible library they like
- for libraries which depend on pint, they can also then work on supporting any standard-compatible arrays
- users can pass Pint
Quantitys to any library which supports standard-compatible arrays, like SciPy (eventually!)
@nabobalis I know you had expressed interest in this.
yes, there is interest. there have been a few issues relating to this in the past but it requires someone to start it! I suggest we create a seperate repo for this as it allows the array code to be iterated and released faster and independently to pint. maybe pint-array or pint-array-api?
alright! Extremely rough POC at https://github.com/lucascolley/pint-array:
In [1]: from pint import UnitRegistry
In [2]: from pint_array import get_namespace
In [3]: import array_api_strict as xp
In [4]: pxp = get_namespace(xp)
In [5]: ureg = UnitRegistry()
In [6]: x = pxp.asarray(xp.arange(6), units=ureg.meter)
In [7]: pxp.sum(x)
Out[7]: <Quantity(15, 'meter')>
In [9]: x = pxp.asarray(x, dtype=xp.float64)
In [10]: pxp.var(x)
Out[10]: <Quantity(2.9166666666666665, 'meter ** 2')>
In [11]: x = pxp.asarray(x, units=ureg.degC)
In [12]: pxp.sum(x)
---------------------------------------------------------------------------
OffsetUnitCalculusError Traceback (most recent call last)
Cell In[12], line 1
----> 1 pxp.sum(x)
File ~/programming/pint-array/pint_array/__init__.py:42, in get_namespace.<locals>.sum(x, axis, dtype, keepdims)
40 units = x.units
41 magnitude = xp.sum(x, axis=axis, dtype=dtype, keepdims=keepdims)
---> 42 units = (1 * units + 1 * units).units
43 return UnitRegistry.Quantity(magnitude, units)
File ~/programming/pint-array/.pixi/envs/default/lib/python3.13/site-packages/pint/facets/plain/quantity.py:849, in PlainQuantity.__add__(self, other)
846 if isinstance(other, datetime.datetime):
847 return self.to_timedelta() + other
--> 849 return self._add_sub(other, operator.add)
File ~/programming/pint-array/.pixi/envs/default/lib/python3.13/site-packages/pint/facets/plain/quantity.py:101, in check_implemented.<locals>.wrapped(self, *args, **kwargs)
99 elif isinstance(other, list) and other and isinstance(other[0], type(self)):
100 return NotImplemented
--> 101 return f(self, *args, **kwargs)
File ~/programming/pint-array/.pixi/envs/default/lib/python3.13/site-packages/pint/facets/plain/quantity.py:825, in PlainQuantity._add_sub(self, other, op)
823 units = other._units
824 else:
--> 825 raise OffsetUnitCalculusError(self._units, other._units)
827 return self.__class__(magnitude, units)
OffsetUnitCalculusError: Ambiguous operation with offset unit (degree_Celsius, degree_Celsius). See https://pint.readthedocs.io/en/stable/user/nonmult.html for guidance.
With a bit of work it should be relatively straightforward based on the work already done to wrap NumPy (I hope).
Thanks, that's very helpful to understand how it works.
Assuming the function signatures are the same for numpy asn the array API, it may be possible to change the implements decorator to also create functions for pint_namespace That would be fine to do in pint.
Assuming the function signatures are the same for numpy asn the array API, it may be possible to change the implements decorator to also create functions for pint_namespace That would be fine to do in pint.
Yes, it seems like we should be able to reuse a lot of that code. Given that NumpyWrapper supports functions (and ufuncs) which aren't in the array API standard, I think it will be easier to keep them separate for now.
I just pushed a change to the repo such that we can define an ArrayUnitQuantity, which is a pint.Quantity with additional methods from ArrayQuantity. Since the return value of the __array_namespace__ method is the pint_namespace which is itself defined using ArrayUnitQuantitys, I'm not sure yet how ArrayQuantity could become a standalone facet to integrate into pint.UnitRegistry. But maybe that isn't needed!
xref https://github.com/astropy/astropy/issues/17588
This is probably as good a moment as any to mention that @nstarman and I have been working on a new Quantity implementation that aims to be array API compatible and agnostic of the unit system - see https://github.com/astropy/quantity-2.0 and the design considerations at https://github.com/astropy/astropy-APEs/pull/91 (especially the PDF linked on top).
As our hope in part is to replace astropy's Quantity, our minimal goal is to provide full coverage of the array API; it would be the full numpy API for ndarray (i.e., including out arguments, specializations of ufuncs like np.add.{at, reduce, reduceat, accumulate}, and gufuncs.
Anyway, the part that is perhaps most relevant here is the unit-agnostic part - this would need some standardization of how units interact with arrays, perhaps new dunder methods of some agreement on operators (e.g., astropy can attach a unit via value << unit - maybe that could become a standard?)
A shared units & Quantity API would be fantastic!
I would vote for defining a quantity-api libraries with some runtime-checkable Protocols.
I mocked up something along these lines in https://github.com/nstarman/units/tree/main/src/units/api
@mhvk , when you say unit system, do you mean the unit module, pint or astropy.units etc?
the linked pdf refers to a Quantity API. If I am understanding it correctly, if pint, astropy.units and any other unit library followed it, then there could be a quantity-array library that works for both pint and astropy.units.
could then make pint-pandas and pint-xarray work for astropy and others too
Yep, that sounds perfect. The namespaces I'm building in pint-array just need an array library, a quantity class from which to inherit, and some units compatible with that quantity. Currently, that's pint.Quantity and pint.UnitRegistry, but it should be relatively simple to swap those out for a new standard if you folks can figure one out!
That's probably the hard part; in the meantime, I'll work on getting these array-quantity namespaces up to standard-compliance, while attempting to do the right thing with units. I could probably use some help writing tests for correct unit-handling at some point!
@lucascolley - spurred on by this conversation, I've pushed my implementation of ufucs/element-wise operations to https://github.com/astropy/quantity-2.0/pull/22 - tests/test_element_wise_functions.py may be fairly directly useful to you.
Yes, there aren't all that many classes and functions needed for the array API, based on pint's numpy array_function implementation. I think a Quantity API defining the classes and functions needed to write the pint-array library would be a good first version. Is https://github.com/astropy/quantity-2.0 the best place for such a discussion on a Quantity API?
There are a load of numpy tests that have been added as people implement numpy functions.
What do we think about creation-like functions such as ones_like? Should the result be dimensionless or have the same units as the input array? Pint currently returns a dimensionless array for NumPy arrays.
that was being changed to have the same units as the input array, but has got stuck https://github.com/hgrecco/pint/issues/882
Up until this point everything has been working with pint.UnitRegistry. I'm now trying to implement https://data-apis.org/array-api/latest/API_specification/generated/array_api.concat.html#array_api.concat, my implementation of which makes use of https://pint.readthedocs.io/en/stable/api/facets.html#pint.facets.plain.PlainQuantity.m_as. That takes us through to this generic _convert function: https://github.com/hgrecco/pint/blob/74b708661577623c0c288933d8ed6271f32a4b8b/pint/facets/plain/registry.py#L1087-L1097
Behavior is not specified when mixing a Python float and an array with an integer data type; this may give float32, float64, or raise an exception.
and so the multiplication here can throw an exception for integer arrays.
Do you think we could try to address this problem in the plain registry @andrewgsavage ? It will be a bit hacky without adding proper array API awareness to Pint, but I think something like
if hasattr(value, __array_namespace__) and value.__array_namespace__().isdtype(value.dtype, "integral"):
factor = int(factor)
might fix the integer array case.
I suppose the alternative is for me to subclass UnitRegistry and override the _convert method. But it might be nice to avoid that if possible?
The conversion factor should not be converted to an int - that would lead to things like 4 inches converting to 8cm instead of 10cm. Can this be handled within the concat implementation by special casing the integer arrays to check consistent units?
ah okay, so if consolidating the units would require a non-integral conversion factor and we have integer arrays, we can raise. Could you point me to a helper in Pint which checks for such a factor, if it exists already? Or otherwise, how would be best to go about writing it?
I don't think we have that. I'd either raise or return a float array. Do you want to use your repo for these discussions to keep this cleaner?
I was just made aware of @simonheybrock's https://pydims.github.io/pydims/index.html, which adds named dimensions and units to array API standard compatible arrays. It supports units from both astropy.units and pint, so it sounds like you should be involved in any discussion of standardising a unit/quantity API @simonheybrock.
Since pydims adds named dimensions, its arrays can't comply with the standard for the same reasons as xarray. So the scopes of pydims and (what is currently called) pint-array are orthogonal in that regard. pint-array should definitely look towards supporting both forms of units like pydims does, but perhaps that will be easier or superseded if we wait for the work that @nstarman and @mhvk have in mind.
alright, https://github.com/lucascolley/pint-array is now passing array-api-tests 🎉 ! The behaviour with units is currently untested and known to be quite buggy, but all of the building blocks are there.
I probably won't have time to work on proper tests, documentation etc. until this summer following my university exams, but contributions are very welcome!
Example of unit-aware integration now working with SciPy 1.15.0 (released yesterday)![^1]
~EDIT: I'm not sure the unit behaviour is correct yet here - I think we would want pint^2 for the first result and pint^3 for the second? EDIT 2: I think this is because I haven't correctly defined the units for __gt__ yet~
EDIT 3: fixed now!
In [1]: import pint_array; import numpy as np; pnp = pint_array.pint_namespace(np); from pint import UnitRegistry; ureg = UnitRegistry()
In [2]: from scipy.integrate import cubature
In [3]: def f(x, n):
...: """f(x) = n * x"""
...: return n*x
...:
In [4]: a = pnp.asarray([0], units=ureg.pint)
In [5]: b = pnp.asarray([1], units=ureg.pint)
In [6]: n = pnp.arange(10)
In [7]: res = cubature(
...: f,
...: a=a,
...: b=b,
...: args=(n,),
...: )
In [8]: res.estimate
Out[8]:
<Quantity(
array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5]),
'pint ** 2'
)>
In [9]: n.magnitude * (1**2) / 2 # closed form
Out[9]: array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])
In [10]: res.error
Out[10]:
<Quantity(
array([0.00000000e+00, 9.02056208e-17, 1.80411242e-16, 3.46944695e-16,
3.60822483e-16, 1.11022302e-16, 6.93889390e-16, 2.49800181e-16,
7.21644966e-16, 1.66533454e-15]),
'pint ** 2'
)>
In [11]: def f(x, n):
...: """f(x) = n * (x * x)"""
...: return n * (x * x)
...:
In [12]: res = cubature(
...: f,
...: a=a,
...: b=b,
...: args=(n,),
...: )
In [13]: res.estimate
Out[13]:
<Quantity(
array([0. , 0.33333333, 0.66666667, 1. , 1.33333333,
1.66666667, 2. , 2.33333333, 2.66666667, 3. ]),
'pint ** 3'
)>
In [14]: n.magnitude * (1**3) / 3 # closed form
Out[14]:
array([0. , 0.33333333, 0.66666667, 1. , 1.33333333,
1.66666667, 2. , 2.33333333, 2.66666667, 3. ])
In [15]: res.error
Out[15]:
<Quantity(
array([0.00000000e+00, 2.77555756e-17, 5.55111512e-17, 3.33066907e-16,
1.11022302e-16, 4.44089210e-16, 6.66133815e-16, 2.49800181e-16,
2.22044605e-16, 3.33066907e-16]),
'pint ** 3'
)>
Still need to fix the unit behaviour of quite a few functions/methods, but this is promising :)
[^1]: you'll need to set the env var SCIPY_ARRAY_API=1
I was just made aware of @SimonHeybrock's https://pydims.github.io/pydims/index.html, which adds named dimensions and units to array API standard compatible arrays. It supports units from both
astropy.unitsandpint, so it sounds like you should be involved in any discussion of standardising a unit/quantity API @SimonHeybrock.Since pydims adds named dimensions, its arrays can't comply with the standard for the same reasons as xarray. So the scopes of pydims and (what is currently called) pint-array are orthogonal in that regard. pint-array should definitely look towards supporting both forms of units like pydims does, but perhaps that will be easier or superseded if we wait for the work that @nstarman and @mhvk have in mind.
Thanks for the shout-out @lucascolley! I would like to note that I am not actively working on PyDims. It was (for now) meant to gauge the communities interest in the approach, including in particular a "Python Units API Standard".
As an aside, there is now yet another Python units library, since @phlptp added Python bindings to https://github.com/LLNL/units.
@SimonHeybrock I have a made a topic proposing collabration on a Quantity API https://github.com/astropy/quantity-2.0/issues/23
Hi, do you have plan to support torch.Tensor as well?
you may be able to use https://data-apis.org/array-api-compat/ along with https://github.com/quantity-dev/quantity-array
yes, supporting PyTorch is part of the goal here. The current caveats on support are listed at https://data-apis.org/array-api-compat/supported-array-libraries.html#pytorch
@andrewgsavage @lucascolley Brillant wrok! Thanks for your information, and I can move my code to pint safely.
@andrewgsavage @lucascolley Brillant wrok! Thanks for your information, and I can move my code to pint safely.
if you do get this working, please post about it here. it would be great to see it in action!
@andrewgsavage @lucascolley Brillant wrok! Thanks for your information, and I can move my code to pint safely.
if you do get this working, please post about it here. it would be great to see it in action!
I will. But now it seems cannot carry pint unit with torch.tensor(e.g. torch.sqrt( Tensor * ureg.unit )), I have to calculate the conversion factor before passing them to module....
Once I have time I will figure out how to use it seamlessly
let's close this out as work is now ongoing over at @quantity-dev