iris
iris copied to clipboard
MeshCoord metadata should include the "mesh" property
We initially chose to leave the 'mesh' property out of the MeshCoordMetadata, largely because at that time we were wary of implementing equivalence on Mesh objects. As noted here
IIRC this was largely due to concerns about the efficiency of calculating mesh equality (i.e. potentially comparing large arrays), but we since essentially changed our minds about this, and implemented full equality testing on meshes anyway
So, in all probability, it would make more sense now to reverse this, and make 'mesh' a participant in MeshCooord.metadata
Present effects
With same 'location' and 'axis' but different 'mesh', we can find that meshcoord1 != meshcoord2, while at the same time meshcoord1.metadata == meshcoord2.metadata. This is not a logical flaw (can be true for coords with different data), but we would expect mesh to participate + it does not.
Arguably, metadata comparison should work as an 'identity equivalence' for data components,
this allows e.g. co = cube1.coord('longitude') ; co2 = cube2.coord(co) to work even when the 2 coords in question have different data, shapes or dtypes. This aspect is basically not working for meshcoords, at this point.
The code here would not be reachable, or would not be needed, if this worked as it does for the other coordinate types
-- instead, the resolve operation "ought to" be able to decide whether 2 meshcoords can be treated as equivalent or not.
It might also make sense to permit some aspects of lenient mesh comparison within this -- e.g. if ~equivalent meshes from different files had component with different var_names -- which is not possible at present.
At present, cubes can be merged between meshes that compare equal, but it requires an exact match in all details (e.g. including var_names)
Similar considerations could also apply to merge+concatenate.
Effects of change
It's not clear, if we simply change this, what knock-on problems could be introduced. Possibly, merging 'equal' meshes, loaded from different files, would stop working (it currently works).
Up to now, no other metadata component has been a complex object -- the most complex is attribute dictionaries, which themselves require special consideration in comparison/combination.
In order to maintain a backlog of relevant issues, we automatically label them as stale after 500 days of inactivity.
If this issue is still important to you, then please comment on this issue and the stale label will be removed.
Otherwise this issue will be automatically closed in 28 days time.
Note this comment, which will need removing even if we decide no action:
https://github.com/SciTools/iris/blob/8604bb80977cb92bfe87d2486315311094219a31/lib/iris/experimental/ugrid/metadata.py#L263-L269
After some thought
TL;DR: I now think we shouldn't do this after all.
I think metadata is generally to be viewed as analagous to "phenomenon" As on cubes or coordinates, for instance ..
- it distinguishes one thing from another, where directly combining/comparing them would be a categorical mistake, like a units error
- (i.e. like adding apples to elephants)
- it characterises the "meaning" of the numbers stored the object, e.g. the numbers are "a longitude" or "a height in metres".
- so in that way, it's a bit like an extended units concept
So for a cube, the metadata is those characteristics which give it a distinct identity as a "type of value" -- but not the values themselves (.data), their type (.dtype) or structure (.shape) At present, this is mostly used in 3 places:
- in comparison (==): But here, it is just a fast check, as non-metadata properties are generally also compared anyway
- in arithmetic : but this doesn't really apply to MeshCoords
- in selecting objects ... read on ...
For selection, component access methods like Cube.coords use metadata as an identity, so you can ...
co = cube1.coord(name)
meta = co.metadata
assert cube2.coord(co).metadata == cube2.coord(meta).metadata == meta
That is, you can get the equivalent coord in a different cube by using the coord (or equivalently, in fact, its metadata) as a reference. This works even when the 2 cubes have different dimensions or coordinate values, because those aspects are outside the metadata.
So, at present, we can do the same with MeshCoords..
x_co = cube1.coord(cube2.coord(mesh_coords=True, axis="x"))
But if we added the mesh to the metadata, this would not be possible, as you could no longer use it to pick meshcoords with similar properties but on a different mesh.
So I think that, with that in mind, adding a mesh to the MeshCoordMetadata would be like adding a cube's coords to its metadata : It would mean that the values and grid-locations of its content become part of the identity, so meshcoords on different meshes can't be matched. Which I think is therefore at odds with how it works for cubes and coords, so we shouldn't do this..
@stephenworsley please "review", i.e. see what you think of the above ideas. Do you agree that we should close this ?
@pp-mo This makes sense, I agree that the status of the mesh property should stay as is given that:
- metadata is a "lightweight" description and meshes are large.
meshalready exists as a property ofMeshCoords.- meshes are more similar to objects which are not considered suitable for including in other metadata (e.g. coords, data) than they are to objects included in metadata (e.g. names, attributes).
- adding meshes to metadata would break some existing ehaviour.
Ok, thanks @stephenworsley . I think we are done with this. Anyone disagreeing can re-open !