YAXArrays.jl

wrong metadata when reading with `Cube` function

Open Sonicious opened this issue 1 year ago • 3 comments

using Zarr
using YAXArrays

da = open_dataset("http://data.rsc4earth.de/EarthSystemDataCube/v3.0.2/esdc-8d-0.25deg-1x720x1440-3.0.2.zarr")
da.kndvi.properties["long_name"]

da = Cube("http://data.rsc4earth.de/EarthSystemDataCube/v3.0.2/esdc-8d-0.25deg-1x720x1440-3.0.2.zarr")
da[Variable = At("kndvi")].properties["long_name"]

These two give different results. Is the metadata read differently, or am I misunderstanding something?

Sonicious, Jun 06 '24 08:06

Metadata handling in the Cube function is wrong. I think it just copies the metadata of the first cube in the list. The problem is that we can assign only one metadata object to the whole cube, whereas a dataset can have global metadata as well as separate metadata for every cube.

felixcremer, Jun 06 '24 08:06

Yes, I am more and more convinced that the Cube function should be deprecated in favor of a combination of open_dataset and some to_array, to make things more explicit. Alternatively, we could think about metadata schemes that are themselves some kind of DimArray, so that a metadata entry can differ for different values along a dimension (here the Variable dimension) and metadata can be preserved when YAXArrays are concatenated along a new axis. However, this is not something YAXArrays has to define alone; maybe @rafaqz already has a solution for this in DimensionalData that we could reuse.

meggart, Jun 06 '24 18:06

Yeah, this is something Rasters.jl already does with RasterStack/Raster; it is mostly defined on AbstractDimStack/AbstractDimArray.

A DimStack has its own metadata and a NamedTuple of metadata for all the layers, and those are attached to a DimArray when you index it by name from the DimStack.

rafaqz, Jun 06 '24 20:06
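
For illustration, the DimStack behaviour described in the last comment can be sketched with DimensionalData alone. This is a minimal sketch, not taken from the thread: the layer names, the metadata keys and values, and the use of plain Dict objects as metadata are all assumptions made for the example.

```julia
using DimensionalData

x = X(1:3)

# Two layers, each carrying its own metadata (names and values are made up).
kndvi = DimArray(rand(3), (x,); name=:kndvi,
                 metadata=Dict("long_name" => "Kernel NDVI"))
ndvi = DimArray(rand(3), (x,); name=:ndvi,
                metadata=Dict("long_name" => "NDVI"))

# The stack gets its own (global) metadata; the per-layer metadata
# is collected from the arrays it is built from.
st = DimStack(kndvi, ndvi; metadata=Dict("title" => "demo stack"))

# Global metadata lives on the stack.
global_title = DimensionalData.metadata(st)["title"]

# Indexing a layer by name returns a DimArray with that layer's metadata attached.
kndvi_name = DimensionalData.metadata(st[:kndvi])["long_name"]
ndvi_name = DimensionalData.metadata(st[:ndvi])["long_name"]
```

Because each layer keeps its own metadata, reading `long_name` from one layer does not return the value belonging to another, which is the mismatch reported above for `Cube`.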