ndcube icon indicating copy to clipboard operation
ndcube copied to clipboard

Sliceable MetaData Class

Open DanRyanIrish opened this issue 4 years ago • 23 comments

Description

This PR introduces a sliceable Meta class based on dict, which allows metadata to be associated with data axes and the whole class to be sliced using the normal Python slicing API. Metadata not associated with any axes is not touched by the slicing mechanism.

This PR introduces two types of axis-aware metadata: axis-aligned and grid-aligned. Axis-aligned metadata gives one value for each of its associated axes, and is scalar or a string if only associated with one axis. This type of metadata is only changed if the axis/axes within which it is associated is dropped by slicing. In this case, the values corresponding to the dropped axes are also dropped.

Grid-aligned metadata provides a value for each data array element in its associated axis/axes. This metadata cannot be scalar. Since this metadata mirrors at least part of the data array, it is sliced as if it were data, and its shape is kept consistent with its associated data axes. If all its axes are sliced away, the scalar value at the row/column location at which it was sliced is kept, and its axis-awareness is dropped, i.e. it becomes a piece of normal non-axis-aware scalar metadata.

API Summary

>>> import astropy.units as u
>>> from ndcube.meta import Meta

# Define a Meta object associated with data of shape (3, 4, 5)
>>> header = {"salutation": "hello",
              "name": "world",
              "exposure_time": ([2] * 3) * u.s,  # 1-dimension grid-aligned (pixel-wise) metadata
              "instrument_axes": ["slit step", "slit", "dispersion"],  # axis-aligned (scalar/string-per-axis) metadata
              "pixel_efficiency": np.random.ones(4, 5)  # multi-dimension grid-aligned (pixel-wise) metadata}
>>> comments = {"salutation": "This is polite."}
>>> axes = {"exposure_time": 0,
              "instrument_axes": (0, 1, 2),
              "pixel_efficiency": (1, 2)}
>>> meta = Meta(header, axes=axes, comments=comments, data_shape=(3, 4, 5))

# Use slice item that you would apply to the cube on the Meta object.
>>> sliced_meta = meta[0:2, 2]  # Reduce length of 0th axis and drop 1st axis.
>>> sliced_meta.shape  # Shape has been updated.
(2, 5)
>>> sliced_meta["exposure_time"]  # Exposure time now length of new 0th axis
<Quantity [2., 2.] s >
>>> sliced_meta["instrument_axes"]  # instrument axes has dropped slit entry
["slit step", "dispersion"]
>>> sliced_meta.axes["instrument_axes"]
(0, 1)
>>> sliced_meta["pixel_efficiency"]  # pixel efficiency is now 1-D of length 5
array([1., 1., 1., 1., 1.])
>>> sliced_meta.axes["pixel_efficiency"]
(1,)

# Rebin the Meta object
>>> rebinned_meta = meta.rebin((1, 2, 1))
>>> rebinned_meta["exposure_time"]  # grid-aligned metadata associated with non-rebinned axes remains unchanged...
<Quantity [2., 2., 2.] s >
>>> rebinned_meta.axes["exposure_time"]  # ...and still associated with its axis/axes
(0,)
>>> rebinned_meta["pixel_efficiency"]  # grid-aligned metadata associated with rebinned axes remains...
np.random.ones(4, 5)
>>> rebinned_meta.axes["pixel_efficiency"]  # ...but grid-awareness is removed
None
>>> rebinned_meta["instrument_axes"]  # axis-aligned (scalar/string-per-axis) metadata unchanged...
["slit step", "slit", "dispersion"]
>>> rebinned_meta.axes["instrument_axes"]  # ...and still associated with axes as they haven't been dropped
(0, 1, 2)

To Do

  • [x] Implement __delitem__ and remove remove method.
  • [x] Implement setter for NDMeta.data_shape which verifies that it's compatible with pre-existing axis- and grid-aligned metadata.
  • [x] Improve narrative docs

DanRyanIrish avatar Aug 14 '21 01:08 DanRyanIrish

Hello @DanRyanIrish! Thanks for updating this PR.

Line 71:70: W504 line break after binary operator

Line 136:38: E721 do not compare types, use 'isinstance()' Line 118:67: W504 line break after binary operator

Comment last updated at 2021-11-18 14:32:19 UTC

pep8speaks avatar Aug 14 '21 01:08 pep8speaks

@nabobalis Could you give this PR a review? It should be very similar if not the same as what's in your sunraster PR.

DanRyanIrish avatar Nov 04 '21 10:11 DanRyanIrish

@Cadair this PR is ready for review. To pre-empt possible concerns regarding DKIST tools, the introduction of this new class does not require that metadata be sliceable. Instead it allows for it if desired.

In order for the Meta object to be sliceable, Meta.shape must be set via the data_shape kwarg during intialization. This must be set to the shape of the data array with which the metadata object is associated. If Meta.shape is None, the object is not sliceable.

Not all entries in the Meta object have to be sliceable. Only those who have a corresponding entry in the Meta.axes dict will be. This gives the axes with which the metadata is associated and is therefore used to work out how to slice it. If Meta.shape is set, but there are no entries in Meta.axes, then slicing has no effect on the Meta object.

Therefore, as far as I understand, this new class should not threaten your assumption in the DKIST tools that .meta never changes.

DanRyanIrish avatar Nov 04 '21 11:11 DanRyanIrish

@Cadair, this is ready to merge as far as I can see. Do you still want more time to look at it given your DKIST concerns? As I above, I don't think this causes an issues as the new functionalities are optional and don't have to change previous behaviour.

DanRyanIrish avatar Nov 10 '21 20:11 DanRyanIrish

@Cadair: I've implemented the change to slicing meta that we discussed, https://github.com/sunpy/ndcube/pull/455/commits/cf923f9ce9d45ea9921a544b33b05262ab56a313, i.e. NDCube only slices the meta if it has the magic attribute __ndcube_can_slice__. Any more concerns before merging this?

DanRyanIrish avatar Nov 18 '21 14:11 DanRyanIrish

This pull request has been automatically marked as stale because it has not had any activity for the past five months. It will be closed if no further activity occurs. If the ideas in this pull request are still worth implementing, please make sure you open an issue to keep track of that idea!

stale[bot] avatar Apr 17 '22 15:04 stale[bot]

Hello again :wave:, We want to thank you again for your contribution to ndcube! This pull request has had no activity since my last reminder, so I am going to close it. If at any time you want to come back to this please feel free to reopen it! If you want to discuss this, please add a comment with: @sunpy/ndcube-developers and someone will get back to you soon.

github-actions[bot] avatar May 18 '22 05:05 github-actions[bot]

@Cadair, went looking for this functionality for another package looking to use it and realised we never actually merged it and the stale bot took it. Any objections to merging it now, pending a couple minor changes to make the tests pass? As the previous comments pointed out, the use of this new Metadata class is optional and so doesn't disrupt any current users' pipelines.

DanRyanIrish avatar May 07 '24 13:05 DanRyanIrish

Anyone know what's causing this warning in the doc build? The traceback is not helpful. @nabobalis?

DanRyanIrish avatar May 07 '24 13:05 DanRyanIrish

So my current thinking is that the doc build is failing due to this:

  1. Meta subclass dict, inheriting its methods.
  2. The methods pop and update (which we don't override here) have code links in them that break our doc build (due to nitpicky).
  3. We have to just ignore these in this file: https://github.com/sunpy/ndcube/blob/main/docs/nitpick-exceptions if you see the yle there, we just need to follow the same but with the message from the doc build.

nabobalis avatar May 07 '24 16:05 nabobalis

This PR should probably also make NDCube.rebin check if the .meta is an instance of this metadata class and if so, just drop the axis-awareness, without removing the metadata itself.

DanRyanIrish avatar May 09 '24 16:05 DanRyanIrish

This PR should probably also make NDCube.rebin check if the .meta is an instance of this metadata class and if so, just drop the axis-awareness, without removing the metadata itself.

Alternatively, how to handle rebinning could to moved onto the Meta class itself via Meta.rebin, and then NDCube.rebin could check if the .meta attribute has a .__ndcube_can_rebin__ attribute, which, if True, calls Meta.rebin. This is the same strategy used for slicing the Meta class. This would enable subclasses of Meta to implement rebinning differently, say, if it applied to well constrained metadata for which rebinning has a clear meaning.

DanRyanIrish avatar May 13 '24 07:05 DanRyanIrish

Documentation of what axis-aware metadata this can handle needs updating.

DanRyanIrish avatar May 20 '24 00:05 DanRyanIrish

@nabobalis The formalisation of "axis-aligned" metadata introduced in this PR, may remove much of the need for sunraster's data classes. instrument_axes could simply become a piece of axis-aligned metadata on the Meta object.

DanRyanIrish avatar May 20 '24 11:05 DanRyanIrish

@nabobalis The formalisation of "axis-aligned" metadata introduced in this PR, may remove much of the need for sunraster's data classes. instrument_axes could simply become a piece of axis-aligned metadata on the Meta object.

That sounds great.

After we merge this in and we get a release out, should we meet to discuss what to do with sunraster?

nabobalis avatar May 20 '24 13:05 nabobalis

After we merge this in and we get a release out, should we meet to discuss what to do with sunraster?

I'd be happy to have that discussion. We could inventory what sunraster now provides in light of this PR other recent-ish functionalities. My sense is the following:

  • spice module can be moved to sunkit-instruments
  • exposure_time_correction can be removed, as users can now directly divide the cube by the exposure time correction thanks to arithmetic operations.
  • instrument_axes can be put onto the .meta as an axis-aligned piece of metadata.

This leaves the

  • solar slit spectrograph metadata API
    • I think this really belongs in sunpy core, but should be part of a wider metadata API policy.
  • The RasterSequence, whose main power is its interchangeable 4-D/3-D with an instrumentally meaningful API
    • I don't foresee this living anywhere other than on a slit-spectrograph-specific subclass of NDCube.
  • Coordinate properties, e.g. spectral_axis.
    • Again, I don't see this living anywhere other than on a slit-spectrograph-specific subclass of NDCube. Unless something clever can be done with a Coords class that has been mentioned before. Even so, I don't see this in the foreseeable future.
    • Alternatively, the same result can be achieved slightly less conveniently through axis_world_coords.

DanRyanIrish avatar May 20 '24 14:05 DanRyanIrish

After we merge this in and we get a release out, should we meet to discuss what to do with sunraster?

I'd be happy to have that discussion. We could inventory what sunraster now provides in light of this PR other recent-ish functionalities. My sense is the following:

  • spice module can be moved to sunkit-instruments

We should ask if the spice team want it in their spice package.

  • exposure_time_correction can be removed, as users can now directly divide the cube by the exposure time correction thanks to arithmetic operations.

Agreed.

  • instrument_axes can be put onto the .meta as an axis-aligned piece of metadata.

Agreed.

This leaves the

  • solar slit spectrograph metadata API

    • I think this really belongs in sunpy core, but should be part of a wider metadata API policy.

This is tricky, do we think we have any time in this year to try and work on this and get a wider metadata API in core/somewhere?

  • The RasterSequence, whose main power is its interchangeable 4-D/3-D with an instrumentally meaningful API

    • I don't foresee this living anywhere other than on a slit-spectrograph-specific subclass of NDCube.
  • Coordinate properties, e.g. spectral_axis.

    • Again, I don't see this living anywhere other than on a slit-spectrograph-specific subclass of NDCube. Unless something clever can be done with a Coords class that has been mentioned before. Even so, I don't see this in the foreseeable future.
    • Alternatively, the same result can be achieved slightly less conveniently through axis_world_coords.

If sunraster became just these two items, I would be ok with it but it feels like the package would be on its last legs. I would be ok with that, it would not require much maintenance to keep it going with those items.

nabobalis avatar May 20 '24 15:05 nabobalis

  • spice module can be moved to sunkit-instruments

We should ask if the spice team want it in their spice package.

That would be even better.

  • I think this really belongs in sunpy core, but should be part of a wider metadata API policy.

This is tricky, do we think we have any time in this year to try and work on this and get a wider metadata API in core/somewhere?

I don't know, but I think it should be a priority for sunpy. I banged the drum on this topic a good while ago, but didn't stay on it. I think we need a renewed effort, but should give some thought about how to get more people interested.

  • The RasterSequence, whose main power is its interchangeable 4-D/3-D with an instrumentally meaningful API

    • I don't foresee this living anywhere other than on a slit-spectrograph-specific subclass of NDCube.
  • Coordinate properties, e.g. spectral_axis.

    • Again, I don't see this living anywhere other than on a slit-spectrograph-specific subclass of NDCube. Unless something clever can be done with a Coords class that has been mentioned before. Even so, I don't see this in the foreseeable future.
    • Alternatively, the same result can be achieved slightly less conveniently through axis_world_coords.

If sunraster became just these two items, I would be ok with it but it feels like the package would be on its last legs. I would be ok with that, it would not require much maintenance to keep it going with those items.

I think you're right that a package for 2 features would feel overkill. But those two features are useful. I think we'd need to think more about the future of sunraster at that point.

DanRyanIrish avatar May 20 '24 19:05 DanRyanIrish

  • spice module can be moved to sunkit-instruments

We should ask if the spice team want it in their spice package.

That would be even better.

As far as I understand, there is no SPICE-specific code in NDCube, so do you mean the SpectrogramCube class derived from NDCube, and the SPICE reader function, both from sunraster?

As we have already discussed, the answer would probably be yes, but is a SpectrogramCube close enough to a NDCube so that it would be easy for us to maintain in sospice? Could the SPICE-specific SpectrogramCube be less generic than the one currently in sunraster? And what about SpectrogramSequence?

ebuchlin avatar May 20 '24 20:05 ebuchlin

Hi @ebuchlin. This is a confusing place to have this conversation, which is my fault. Apologies. I did not mean that functionality should be moved from ndcube to sospice. I was suggesting passing the following to sospice from sunraster.

  • the spice module, including the spice FITS reader and SPICEMeta class.

This would leave two main functionalities in sunraster:

  • The generic metadata API definitions in the meta.py module on which the SPICEMeta class depends;
  • The SpectrogramCube, SpectrogramSequence, and RasterSequence data classes.

DanRyanIrish avatar May 23 '24 10:05 DanRyanIrish

@Cadair, as requested, an explainer on the API/behaviour has been added to the PR description.

DanRyanIrish avatar Jun 19 '24 17:06 DanRyanIrish

@Cadair, except for narrative docs, I think this PR now addresses all but one minor comment of yours. Unless you'd like to look at it again, I will merge it once I've added narrative docs.

DanRyanIrish avatar Jul 03 '24 08:07 DanRyanIrish

Thanks for the review @Cadair. I've addressed your comments with a couple exceptions where I ask for further clarifications.

DanRyanIrish avatar Jul 04 '24 13:07 DanRyanIrish