pint scalars have no shape, unlike numpy scalars

Open TomNicholas opened this issue 3 years ago • 8 comments

Pint scalars constructed from bare floats have no shape, unlike numpy scalars. For example:

In [13]: a = np.array(2.0)

In [14]: a.shape
Out[14]: ()

In [15]: q = pint.Quantity(2.0)

In [16]: q.shape
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/pint/quantity.py in __getattr__(self, item)
   1841         try:
-> 1842             return getattr(self._magnitude, item)
   1843         except AttributeError:

AttributeError: 'float' object has no attribute 'shape'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
<ipython-input-16-0fad4e80c834> in <module>
----> 1 q.shape

~/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/pint/quantity.py in __getattr__(self, item)
   1842             return getattr(self._magnitude, item)
   1843         except AttributeError:
-> 1844             raise AttributeError(
   1845                 "Neither Quantity object nor its magnitude ({}) "
   1846                 "has attribute '{}'".format(self._magnitude, item)

AttributeError: Neither Quantity object nor its magnitude (2.0) has attribute 'shape'

This is a pretty significant departure from a numpy-like API, at least in terms of construction.

Creating a Quantity from a numpy array does create an object which has a shape, but the two differently-behaving objects have the same repr!!!

In [17]: q = pint.Quantity(np.array(2.0))

In [18]: q
Out[18]: array(2.) <Unit('dimensionless')>

In [19]: q.shape
Out[19]: ()

In [20]: repr(q)
Out[20]: "<Quantity(2.0, 'dimensionless')>"

In [21]: q = pint.Quantity(2.0)

In [22]: repr(q)
Out[22]: "<Quantity(2.0, 'dimensionless')>"

In [23]: q.shape
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/pint/quantity.py in __getattr__(self, item)
   1841         try:
-> 1842             return getattr(self._magnitude, item)
   1843         except AttributeError:

AttributeError: 'float' object has no attribute 'shape'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
<ipython-input-23-0fad4e80c834> in <module>
----> 1 q.shape

~/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/pint/quantity.py in __getattr__(self, item)
   1842             return getattr(self._magnitude, item)
   1843         except AttributeError:
-> 1844             raise AttributeError(
   1845                 "Neither Quantity object nor its magnitude ({}) "
   1846                 "has attribute '{}'".format(self._magnitude, item)

AttributeError: Neither Quantity object nor its magnitude (2.0) has attribute 'shape'

This should not be possible - the repr should have a one-to-one relationship with the instance - and I would be interested in submitting a PR to help fix it.

I noticed this whilst trying to create an xarray.DataArray which wraps a quantified scalar:

In [24]: xr.DataArray(pint.Quantity(2.0))
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/pint/quantity.py in __getattr__(self, item)
   1841         try:
-> 1842             return getattr(self._magnitude, item)
   1843         except AttributeError:

AttributeError: 'float' object has no attribute 'shape'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
<ipython-input-24-dadd2575532d> in <module>
----> 1 xr.DataArray(pint.Quantity(2.0))

~/Documents/Work/Code/xarray/xarray/core/dataarray.py in __init__(self, data, coords, dims, name, attrs, indexes, fastpath)
    412             data = _check_data_shape(data, coords, dims)
    413             data = as_compatible_data(data)
--> 414             coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
    415             variable = Variable(dims, data, attrs, fastpath=True)
    416             indexes = dict(

~/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/pint/quantity.py in __getattr__(self, item)
   1842             return getattr(self._magnitude, item)
   1843         except AttributeError:
-> 1844             raise AttributeError(
   1845                 "Neither Quantity object nor its magnitude ({}) "
   1846                 "has attribute '{}'".format(self._magnitude, item)

AttributeError: Neither Quantity object nor its magnitude (2.0) has attribute 'shape'

In [25]: xr.DataArray(pint.Quantity(np.array(2.0)))
Out[25]: 
<xarray.DataArray ()>
<Quantity(2.0, 'dimensionless')>

(@keewis you probably want to see this too)

(This is with pint version 0.17)

TomNicholas avatar Jul 03 '21 16:07 TomNicholas

This relates to a perennial discussion point in Pint about the handling of scalars and arrays (e.g., https://github.com/hgrecco/pint/issues/1128, https://github.com/hgrecco/pint/issues/950, https://github.com/hgrecco/pint/issues/753) that hasn't seen a good resolution yet. In short, this seems to be a natural consequence of the Pint Quantity being a flexible wrapper class that is equally capable of working with and without NumPy: Quantities wrapping Python scalars behave more like Python scalars, and Quantities wrapping NumPy scalars behave like NumPy scalars. Though, is this still an issue you encounter when using a Pint registry with force_ndarray_like=True?

All that being said, I definitely agree about the repr. That would be a good fix!

jthielen avatar Jul 03 '21 16:07 jthielen

Oh I don't know how I didn't see #950 when I searched for previous issues! I think my issue is basically a duplicate of that one.

> Though, is this still an issue you encounter when using a Pint registry with force_ndarray_like=True?

Kind of? Depends on whether I create the quantity with pint or ureg:

In [33]: from pint import UnitRegistry
    ...: ureg = UnitRegistry(force_ndarray_like=True)

In [34]: q = ureg.Quantity(2.0)

In [35]: q.shape
Out[35]: ()

In [36]: q = pint.Quantity(2.0)

In [37]: q.shape
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/pint/quantity.py in __getattr__(self, item)
   1841         try:
-> 1842             return getattr(self._magnitude, item)
   1843         except AttributeError:

AttributeError: 'float' object has no attribute 'shape'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
<ipython-input-37-0fad4e80c834> in <module>
----> 1 q.shape

~/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/pint/quantity.py in __getattr__(self, item)
   1842             return getattr(self._magnitude, item)
   1843         except AttributeError:
-> 1844             raise AttributeError(
   1845                 "Neither Quantity object nor its magnitude ({}) "
   1846                 "has attribute '{}'".format(self._magnitude, item)

AttributeError: Neither Quantity object nor its magnitude (2.0) has attribute 'shape'

That's not really ideal - force_ndarray_like is really a global choice about the behaviour of pint, it's not a choice specific to one unit registry...

> All that being said, I definitely agree about the repr. That would be a good fix!

What would be the fix though? If there were a QuantityArray class, that would distinguish the reprs of the two objects, but barring that, how else am I supposed to distinguish them? Unless you want to change the reprs to <Quantity(2.0, 'dimensionless')> and <Quantity(np.array(2.0), 'dimensionless')>?

I personally don't really see what the point of diverging from numpy's interface at all is.

TomNicholas avatar Jul 03 '21 16:07 TomNicholas

> That's not really ideal - force_ndarray_like is really a global choice about the behaviour of pint, it's not a choice specific to one unit registry...

I suppose that this does effectively solve the problem with xarray though: pint-xarray can always require force_ndarray_like=True... But looks like @keewis is several steps ahead of me!

TomNicholas avatar Jul 03 '21 16:07 TomNicholas

> Kind of? Depends on whether I create the quantity with pint or ureg:

> ...

> That's not really ideal - force_ndarray_like is really a global choice about the behaviour of pint, it's not a choice specific to one unit registry...

I can definitely see that, but from what I recall every choice in Pint is on the registry level... it just happens that the global Quantity class uses the application registry, which has a default but can otherwise be configured with set_application_registry (xref https://github.com/xarray-contrib/pint-xarray/issues/7, https://github.com/hgrecco/pint/pull/880).

> All that being said, I definitely agree about the repr. That would be a good fix!

> What would be the fix though? If there were a QuantityArray class, that would distinguish the reprs of the two objects, but barring that, how else am I supposed to distinguish them? Unless you want to change the reprs to <Quantity(2.0, 'dimensionless')> and <Quantity(np.array(2.0), 'dimensionless')>?

I was guessing including the type of the magnitude in some form, a la Dask, so maybe <Quantity(2.0, units='dimensionless', type=float)> and <Quantity(2.0, units='dimensionless', type=numpy.ndarray)>? Though that might be misleading too since type isn't an argument to the constructor.

> I personally don't really see what the point of diverging from numpy's interface at all is.

As a user who's only worked with Pint alongside NumPy, I would tend to agree; however, I recognize that the library has been deliberately designed for use without NumPy. But, with the current state of the ecosystem, maybe that's something that merits more discussion? What are the maintainers' thoughts here?

jthielen avatar Jul 03 '21 17:07 jthielen

> I can definitely see that, but from what I recall every choice in Pint is on the registry level...

I suppose. Conceptually I find that a bit confusing (because how you treat scalar arrays has nothing to do with what system of units you use), but I see why it's like this.

> I was guessing including the type of the magnitude in some form, a la Dask, so maybe <Quantity(2.0, units='dimensionless', type=float)> and <Quantity(2.0, units='dimensionless', type=numpy.ndarray)>? Though that might be misleading too since type isn't an argument to the constructor.

That would be a very clear representation, but it also doesn't really follow the convention of the repr being executable.

TomNicholas avatar Jul 03 '21 17:07 TomNicholas

My thoughts about some of the raised issues are the following:

Global vs registry-level flags: I don't like global variables, as I think they make things more difficult to reason about. So in Pint we try to write stateless functions, and when state is needed we keep it in instances (not globally). In a way, Pint started from this idea, as the registry is not a module (as it was in other libraries at the time) but rather an object that parses a (text) file. Having registry-level flags is just setting the defaults for how the registry interacts with the user: force_ndarray, force_ndarray_like, auto_reduce_dimensions, non_int_type and case_sensitive tell the registry how to interpret user input.

Other ways to specify flags (not just force_ndarray): We can create a sort of context manager to be able to change these flags for a block of code. Something like:

with ureg.force_ndarray():
    # code goes here

or, more generally:

with ureg.with_flags(force_ndarray=True, case_sensitive=False):
    # code goes here

I have recently been playing with contextvars to make Breadcrumbs. It is easy to use, robust and could work for this.
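
A block-scoped flag override of this kind could be built on contextvars roughly as follows. This is only a sketch under assumptions: with_flags and get_flag are hypothetical module-level helpers, not Pint API, and the two flag names and their defaults are used purely for illustration.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical flag store for illustration; Pint keeps these on the registry.
_FLAGS = contextvars.ContextVar(
    "flags", default={"force_ndarray": False, "case_sensitive": True}
)


def get_flag(name):
    """Return the value of a flag in the current context."""
    return _FLAGS.get()[name]


@contextmanager
def with_flags(**overrides):
    """Override some flags for the enclosed block, then restore them."""
    current = _FLAGS.get()
    unknown = set(overrides) - set(current)
    if unknown:
        raise TypeError(f"unknown flags: {sorted(unknown)}")
    # Set a new dict rather than mutating the current one, so that
    # reset() can restore the previous state exactly.
    token = _FLAGS.set({**current, **overrides})
    try:
        yield
    finally:
        _FLAGS.reset(token)


with with_flags(force_ndarray=True, case_sensitive=False):
    inside = (get_flag("force_ndarray"), get_flag("case_sensitive"))

outside = (get_flag("force_ndarray"), get_flag("case_sensitive"))
```

Since a ContextVar is scoped per thread and per asyncio task, an override made in one task should not leak into another, which is one reason this approach might suit the use case.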

Requiring NumPy for Pint: The choice of making Pint independent of NumPy comes from the time in which wheels and conda were only a dream. NumPy was not available as a binary on some platforms, and I had to recompile SciPy to install it. Things have indeed changed, in a good way. However, I think that before requiring NumPy it would be better to explore the old idea of making a QuantityScalar and a QuantityArray class. My understanding is that this would help downstream projects that already have this distinction. It would also make it possible to require NumPy in the future, or to keep not requiring it (I think).

repr: I agree with the criticism. I like Alex Martelli's views in this post. I would agree with any PR that goes in this direction.

hgrecco avatar Jul 08 '21 21:07 hgrecco

@jthielen We can change the repr to just use the repr of the magnitude and add the units to it; that's easy enough.


<Quantity(array(2.), 'dimensionless')>
<Quantity(array(2., dtype=float32), 'dimensionless')>
<Quantity(array([2., 3., 4., ..., 2., 3., 4.]), 'dimensionless')>
<Quantity(dask.array<array, shape=(3,), dtype=float64, chunksize=(3,), chunktype=numpy.ndarray>, 'dimensionless')>
<Quantity(2.932515646146465, 'dimensionless')>
<Quantity(Decimal('2.9325156461464651456456545649'), 'dimensionless')>

using units= might be a good idea too.
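
To see why delegating to the magnitude's own repr removes the ambiguity, here is a minimal sketch. ToyQuantity is a hypothetical stand-in, not Pint's Quantity class, and it uses only the standard library, so a Decimal plays the role of the non-float magnitude.

```python
from decimal import Decimal


class ToyQuantity:
    """Hypothetical stand-in for a quantity; not Pint's implementation."""

    def __init__(self, magnitude, units="dimensionless"):
        self.magnitude = magnitude
        self.units = units

    def __repr__(self):
        # Delegate to the magnitude's repr so its type stays visible.
        return f"<Quantity({self.magnitude!r}, {self.units!r})>"


as_float = repr(ToyQuantity(2.0))
as_decimal = repr(ToyQuantity(Decimal("2.0")))
print(as_float)    # <Quantity(2.0, 'dimensionless')>
print(as_decimal)  # <Quantity(Decimal('2.0'), 'dimensionless')>
```

With a NumPy magnitude the same delegation would give the array(2.) form listed above, so the two objects from the original report would no longer share a repr.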

jules-ch avatar Mar 02 '22 15:03 jules-ch

Revisiting this issue just to point out that if Pint conformed to the Python array API standard (NEP 47, see https://github.com/hgrecco/pint/issues/1592), then it would imply an opinionated fix for this issue (at least in the context of pint-xarray solving #216).

TomNicholas avatar Dec 13 '23 21:12 TomNicholas