iris `MeshCoord` constructor always uses metadata of node coordinate regardless of `location`

`MeshCoord` constructor always uses metadata of node coordinate regardless of `location`

Open schlunma opened this issue 1 year ago • 5 comments

The MeshCoord constructor always uses the metadata (standard_name, long_name, units, etc.) of the node coordinate, regardless of the location that is specified:

https://github.com/SciTools/iris/blob/2cbaec747820410cce41db004a41e68d6ff63ee8/lib/iris/experimental/ugrid/mesh.py#L2845-L2853

That feels really weird to me and can lead to unexpected behavior. For example, if you compare the original face coordinate that is used to construct a Mesh and compare it to the output of Mesh.to_MeshCoord you get different results:

# define node coordinates
(node_lat, node_lon) = ...

# define face coordinates with different metadata
(face_lat, face_lon) = ...

# construct mesh with it
mesh = Mesh(                                                                                                                                                                                                                                                           
            topology_dimension=2,                                                                                                                                                                                                                                              
            node_coords_and_axes=[(node_lat, 'y'), (node_lon, 'x')],                                                                                                                                                                                                           
            connectivities=[connectivity],                                                                                                                                                                                                                                     
            face_coords_and_axes=[(face_lat, 'y'), (face_lon, 'x')])

# extract MeshCoord
mesh_face_lat = mesh.to_MeshCoord('face', 'y')

Here, the metadata of face_lat and mesh_face_lat differ since the metadata of face_lat and node_lat differ.

I am not sure if this is an intended feature or not, so please feel free to just close this if this is intentional.

Cheers :+1:

Iris version: 3.2.1.post0

Jul 12 '22 16:07 schlunma

@schlunma Thanks for taking the time to report this, much appreciated!

I'm particularly excited simply because it means that you're using our experimental ugrid support, which is brilliant :star_struck:

We're mostly all busy this week attending SciPy 2022 (13-15 July), but we'll get back to you asap :+1:

I'll class this issue as a bug, until we know otherwise.

Jul 13 '22 08:07 bjlittle

Thanks @Bill, that's great!

Yes, I've used it quite extensively in the last couple of weeks (also in combination with iris-esmf-regrid), and I'm really happy with it :+1: Thanks for all the great work on this :rocket:

Jul 13 '22 08:07 schlunma

I am not sure if this is an intended feature or not, so please feel free to just close this if this is intentional.

Well, it kind-of is intended this way, but we can be open to persuasion ... The logic for this was that the "number 1 purpose" of meshcoords is to provide a location-indexed arrays of location bounds -- since that information is otherwise not easily available. The bounds are always values of the node_coords, hence for the bounds (at least) it makes some sense to say that the applicable metadata (e.g. the units) are those from the node_coords variables. ( Note: we are also intending in future to support MeshCoords with bounds but no points, though that is currently not possible : see https://github.com/SciTools/iris/discussions/4438 ).

However, I guess it is at least arguable that we should be requiring the location + noords coordinates two to have equivalent or compatible metadata , which we could test for -- e.g. what if the face_coords had different units to the node_coords?

It may also be worth considering that MeshCoords don't actually add anything to the cube.location + cube.mesh information : They are a bridging concept providing unstructured coordinate info in a form more like the structured equivalents.

What do you think will be useful ?

Just great to hear that you're using this, though 🚀 ! We don't have so much feedback just yet.

Jul 13 '22 15:07 pp-mo

Thanks for these insights @pp-mo, I can definitely understand the reasoning behind this! However, I still think this is confusing and can be wrong in some (unrealistic, but not completely impossible) cases.

Example

Consider the data below where the node metadata is very different to the face metadata (even the units differ!).

Netcdf file with unstructured grid

netcdf cube {
dimensions:
        Mesh2d_node = 10242 ;
        Mesh2d_face = 20480 ;
        bnds_3 = 3 ;
        Mesh_2d_face_N_nodes = 3 ;
variables:
        int Mesh_2d ;
                Mesh_2d:cf_role = "mesh_topology" ;
                Mesh_2d:topology_dimension = 2 ;
                Mesh_2d:Conventions = "UGRID-1.0" ;
                Mesh_2d:node_coordinates = "nlon nlat" ;
                Mesh_2d:face_coordinates = "flon flat" ;
                Mesh_2d:face_node_connectivity = "mesh2d_face" ;
        double nlon(Mesh2d_node) ;
                nlon:units = "rad" ;
                nlon:standard_name = "grid_longitude" ;
                nlon:long_name = "node longitude" ;
        double nlat(Mesh2d_node) ;
                nlat:units = "rad" ;
                nlat:standard_name = "grid_latitude" ;
                nlat:long_name = "node latitude" ;
        double flon(Mesh2d_face) ;
                flon:bounds = "flon_bnds" ;
                flon:units = "degrees_east" ;
                flon:standard_name = "longitude" ;
                flon:long_name = "face longitude" ;
        double flon_bnds(Mesh2d_face, bnds_3) ;
        double flat(Mesh2d_face) ;
                flat:bounds = "flat_bnds" ;
                flat:units = "degrees_north" ;
                flat:standard_name = "latitude" ;
                flat:long_name = "face latitude" ;
        double flat_bnds(Mesh2d_face, bnds_3) ;
        int mesh2d_face(Mesh2d_face, Mesh_2d_face_N_nodes) ;
                mesh2d_face:cf_role = "face_node_connectivity" ;
                mesh2d_face:start_index = 1 ;
        int64 tas(Mesh2d_face) ;
                tas:standard_name = "air_temperature" ;
                tas:long_name = "Near-surface air temperature" ;
                tas:units = "K" ;
                tas:mesh = "Mesh_2d" ;
                tas:location = "face" ;

// global attributes:
                :Conventions = "CF-1.7" ;
}

Loading this cube gives this:

air_temperature / (K)               (-- : 20480)                                                                                                                                                                                                                              
    Mesh coordinates:                                                                                                                                                                                                                                                         
        grid_latitude                   x
        grid_longitude                  x
    Attributes:
        Conventions                 'CF-1.7'

The first thing you notice is that somehow the coordinates describing the faces have the name grid_latitude and grid_longitude, even though in the netcdf file these coordinates are called latitude and longitude. If we have a closer look at one of those coordinates, we get:

MeshCoord :  grid_latitude / (rad)
    mesh: <Mesh: 'Mesh_2d'>
    location: 'face'
    points: [ 52.61807547,  53.87577873, ..., -46.82864641, -44.85063277]
    bounds: [
        [ 0.92919108,  0.92919108,  0.89622949],
        [ 0.92919108,  0.96233691,  0.92919108],
        ...,
        [-0.8285508 , -0.82920591, -0.79519872],
        [-0.79519872, -0.76070502, -0.79398573]]
    shape: (20480,)  bounds(20480, 3)
    dtype: float64
    standard_name: 'grid_latitude'
    long_name: 'node latitude'
    axis: 'y'

You clearly see that the points and the bounds clearly do not fit together, and the units clearly do not match the point values. This is not only confusing, but also wrong. In addition, the bounds here do not match the ones given in the original file by flat_bnds.

Ideas

One possible idea on how to deal with this could be:

If the mesh is constructed without a face/edge coordinate, the approach that is currently implemented is the way to go. Asking for a MeshCoord(mesh, 'face/edge', 'x/y') should result in a coordinate with empty points and bounds calculated from the connectivity and the node coordinate. In this case, the node metadata correctly describes the face/edge coordinate. However, if I interpret the code correctly, asking for a face/edge MeshCoord when no face/edge coordinate has been used to construct the mesh is not possible at the moment.
If the mesh is constructed with a face/edge coordinate, the situation is more tricky. First, it should be checked if the metadata of the nodes and the face/edge coordinates match (units + standard_name at the very least, I wouldn't check others). This allows us to safely use the more natural face/edge coordinate metadata for the face/edge MeshCoords. One additional problem might arise if the face/edge coordinates already have bounds:

face/edge coordinates have bounds: it might be reasonably to check if these bounds match the ones calculated from the connectivity and the nodes. If they don't match, it might be worthwhile to raise an error here, even though ugrid-checks only prints a warning (WARN A205 : Mesh coordinate variable "clon" within Mesh_2d:face_coordinates has bounds="clon_bnds", which does not match the expected values calculated from the "vlon" node coordinate and the face connectivity, "mesh2d_face".).
face/edge coordinates do not have bounds: Calculate the bounds from the connectivity and the nodes as it is done right now.

One additional thought I have here is that it might be reasonable to perform theses checks already when creating the mesh. That way problems arise very early.

Let me know what you think :rocket:

Jul 14 '22 07:07 schlunma

Could someone with the necessary permissions add the ESMValTool label to this? Thanks!! 👍

Sep 12 '22 09:09 schlunma

Hi @schlunma . Just to say, we're very aware that this is top priority on the ESMValTool Iris project, and I am now looking into it !

I think that pinning down all these decisions may be a bit tricky. So, I will make some initial decisions + implement them in a PR, while still expecting that you may want to request more changes. Hopefully, you can evaluate that before the workshop in 2 weeks' time -- e.g. you can try actual code. Then if it needs more work, we can work it out together.

Oct 03 '22 15:10 pp-mo

Hi @pp-mo, thanks, that sounds great!

I will be travelling this week and next week, so I might be slow to respond, but I will definitely try to test it before the workshop👍

Oct 04 '22 09:10 schlunma

iris iris copied to clipboard

`MeshCoord` constructor always uses metadata of node coordinate regardless of `location`

Example

Ideas

iris
iris copied to clipboard