clifford icon indicating copy to clipboard operation
clifford copied to clipboard

Improve MVArray

Open hugohadfield opened this issue 4 years ago • 4 comments

MVArray is very useful but a little lacking in functionality. My current wishlist would include:

  • [ ] Nd multivector arrays to and from Nd numpy arrays with the coefficient arrays on the last axis
  • [ ] Reduction products for gp and op along specified dimensions
  • [ ] Optimised elementwise ops on mvarrays that sidestep the runtime type and layout checking

Given the recent work on JITing MultiVectors perhaps we should also look at doing something similar to the MVArray?

hugohadfield avatar Jun 23 '20 09:06 hugohadfield

to and from Nd numpy arrays with the coefficient arrays on the last axis

I got some advice on the numpy issue tracker to consider using a type like dtype([('value', (float, 32))]), which makes it much harder to accidentally work with an array when you expect a multivector. I believe this came from @mhvk regarding how things are handled in Astropy.

eric-wieser avatar Jun 23 '20 09:06 eric-wieser

Indeed, I think for, say, 3-D vectors I'd recommend something like np.dtype([('vector', '3f8')]) or perhaps np.dtype([('x', 'f8'), ('y', 'f8'), ('z', 'f8')]). The huge advantage is that the shape is automatically correct, so indexing and broadcasting work easily.

In astropy, we do not use that for vectors (instead we store separate x, y, z, as it is part of the coordinate representation machinery that also allows transforming to longitute, latitude, distance, etc.

But we do use it for monte-carlo error propagation, where each element has a large number of randomly drawn samples, which can then get propagated; here again it is very useful that the shape of the distribution instances is correct. See https://docs.astropy.org/en/latest/uncertainty/index.html and https://docs.astropy.org/en/latest/_modules/astropy/uncertainty/core.html#Distribution, in particular the view taken in __new__ using

         new_dtype = np.dtype({'names': ['samples'],
                              'formats': [(samples.dtype, (samples.shape[-1],))]})

(the implementation of __array_ufunc__ may also be relevant)

mhvk avatar Jun 23 '20 13:06 mhvk

In astropy, we do not use that for vectors (instead we store separate x, y, z, as it is part of the coordinate representation machinery that also allows transforming to longitute, latitude, distance, etc.

What dtype does this end up storing things in under the hood? Your concept of a coordinate representation shares some features of our layout concept, so we might want to play a similar trick.

eric-wieser avatar Jun 23 '20 13:06 eric-wieser

The representations generally store x, y, z as individual Quantity arrays (i.e., they carry a unit); there is some special code in place that for an array with dimension 3 somewhere, it will just store appropriate views.

For Cartesian, a specific dimension for x,y,z makes a lot of sense, but it is trickier for spherical, as we do not yet have support for different units on items in a structured dtype (it is in the works; it is one place where it is clear that having the unit as part of the dtype makes a lot of sense...)

Aside: if you're not subclassing ndarray, you'll probably need to implement a lot of numpy-like methods; for anything shape-related, you may have a look at our astropy.utils.shapes, which centralizes that quite handily (I just added support for __array_function__ so things like np.broadcast_to work on subclasses -- which for us is representations, coordinates, times). It is probably something that would make sense to have in numpy proper, to join the arithmetic mixin (piping everything through __array_function__ rather than _apply, probably).

mhvk avatar Jun 23 '20 15:06 mhvk