cf-xarray

Different recognized coordinates when calling variable by standard name

Open kthyng opened this issue 1 year ago • 13 comments

I know this isn't good form, but I am going to describe my problem to see if anyone has an idea of a direction to go, without a good example to start. I am using several libraries together and making an example case seems difficult.

Here is the base question though: ds['zeta'].cf['longitude'] worked, but ds.cf['sea_surface_height_above_mean_sea_level'].cf['longitude'] did not.

In other words, when I used the variable name to access a variable in my dataset, cf-xarray knew the mapping for longitude. But when I referred to the variable by its standard name, which cf-xarray recognized, cf-xarray did not then know the mapping for longitude. This seems weird, right? Any ideas of what could be wrong?

kthyng avatar Aug 09 '22 20:08 kthyng

If longitude is not a dimension coordinate, I think you'll need longitude in the coordinates attribute of ds.zeta. I bet ds.cf["zeta"].cf["longitude"] also does not work?

Can you add this to the FAQ if you have time: https://cf-xarray.readthedocs.io/en/latest/faq.html?

dcherian avatar Aug 09 '22 21:08 dcherian

Yes ds.cf["zeta"].cf["longitude"] also did not work.

You should have an autoresponder for my questions that says it is always the coordinates attribute. And I'll try to add to the FAQ tomorrow!

kthyng avatar Aug 09 '22 21:08 kthyng

Should the coordinates attribute trump everything else for interpreting metadata? It is hidden from view (tucked into .encoding) and I didn't know about it until I started using cf-xarray, which is why I always forget about it. I couldn't find it in the docs at all so far, though there are a lot of mentions of "coordinates" so I might have missed it. I find it very tricky, whereas interpreting standard_names, units, etc. seems much more straightforward.

kthyng avatar Aug 10 '22 16:08 kthyng

To attach coordinate variables when you pull out a DataArray, Xarray checks whether the coordinate's dimensions are a subset of the variable's dimensions (roughly set(coord.dims) <= set(var.dims)). cf-xarray is more "clever" and parses the coordinates and ancillary_variables attributes. You're right, it would be good to improve the docs on this point.

Comparing popds.cf["UVEL"] and popds["UVEL"] should be useful (popds is in cf_xarray.datasets). The first will have ULAT, ULONG which is what we really want, the second will have ULAT, ULONG, TLAT, TLONG because all of those variables have the same dimensions.
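The difference between the two selection rules can be sketched in plain Python. This is a simplified model for illustration only, not Xarray's or cf-xarray's actual implementation; the dimension names and the helper functions coords_by_dims and coords_by_attribute are assumptions made up for this sketch.

```python
# Simplified stand-in for a dataset: variable name -> (dims, attrs).
# Dimension names are illustrative, not copied from popds.
variables = {
    "UVEL": (("nlat", "nlon"), {"coordinates": "ULONG ULAT"}),
    "ULAT": (("nlat", "nlon"), {}),
    "ULONG": (("nlat", "nlon"), {}),
    "TLAT": (("nlat", "nlon"), {}),
    "TLONG": (("nlat", "nlon"), {}),
}
coord_names = {"ULAT", "ULONG", "TLAT", "TLONG"}


def coords_by_dims(name):
    """Dimension rule: keep every coordinate whose dims are a subset of the variable's dims."""
    var_dims = set(variables[name][0])
    return sorted(c for c in coord_names if set(variables[c][0]) <= var_dims)


def coords_by_attribute(name):
    """Attribute rule: keep only coordinates named in the variable's `coordinates` attribute."""
    listed = variables[name][1].get("coordinates", "").split()
    return sorted(c for c in listed if c in coord_names)


print(coords_by_dims("UVEL"))       # ['TLAT', 'TLONG', 'ULAT', 'ULONG']
print(coords_by_attribute("UVEL"))  # ['ULAT', 'ULONG']
```

Because all four coordinate variables share the same dimensions, the dimension rule cannot tell the U grid from the T grid; only the attribute carries that information.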

> It is hidden from view (tucked into .encoding)

This is where a nice HTML repr for .cf would be awesome. It would just show all CF attributes from both .attrs and .encoding.

dcherian avatar Aug 10 '22 16:08 dcherian

How is the coordinates attribute set originally? Is it subsequently modified? I'm not sure what should be in it for any given variable. For example, zeta in xr.tutorial.open_dataset('ROMS_example.nc') has 'coordinates': 'lon_rho hc h Vtransform lat_rho' though I had thought it would be ocean_time lat_rho lon_rho.

kthyng avatar Aug 10 '22 16:08 kthyng

> How is the coordinates attribute set originally? Is it subsequently modified?

It's set in the netCDF dataset. It is never modified. Xarray moves it to encoding if decode_coords=True in open_*dataset.
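A minimal sketch of that move, assuming xarray and numpy are installed. It uses xr.decode_cf on an in-memory dataset as a stand-in for the decoding step that open_dataset performs; the variable names follow the ROMS example mentioned in this thread, while the shapes, values, and the shortened coordinates string are invented for illustration.

```python
import numpy as np
import xarray as xr

# An undecoded dataset, roughly as it would look straight out of a netCDF file:
# the `coordinates` attribute still sits in .attrs and nothing is a coordinate yet.
raw = xr.Dataset(
    {
        "zeta": (
            ("ocean_time", "eta_rho", "xi_rho"),
            np.zeros((1, 2, 3)),
            {
                "standard_name": "sea_surface_height_above_mean_sea_level",
                "coordinates": "lon_rho lat_rho",
            },
        ),
        "lon_rho": (("eta_rho", "xi_rho"), np.zeros((2, 3)), {"standard_name": "longitude"}),
        "lat_rho": (("eta_rho", "xi_rho"), np.zeros((2, 3)), {"standard_name": "latitude"}),
    }
)

decoded = xr.decode_cf(raw, decode_coords=True)

# After decoding, the attribute has moved from .attrs to .encoding,
# and the variables it names have been promoted to coordinates.
print("coordinates" in decoded["zeta"].attrs)
print(decoded["zeta"].encoding.get("coordinates"))
print(sorted(decoded["zeta"].coords))
```

Checking decoded["zeta"].encoding is therefore the place to look when a CF lookup on a variable unexpectedly fails.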

That ROMS attribute looks incomplete IMO: hc, h, Vtransform for zeta seems weird but I guess those are needed to calculate it? Including the names of dimensions in coordinates is unnecessary IIRC but it's not wrong to do so.

dcherian avatar Aug 10 '22 17:08 dcherian

I have gotten stuck in the past where the coordinates are either missing or wrong. One time this came up, I think, was when calculating the z coordinates, which should then be included in the coordinates attribute, shouldn't they?

kthyng avatar Aug 10 '22 17:08 kthyng

Yes, the 4D z variable is a great use for the coordinates attribute.

dcherian avatar Aug 10 '22 17:08 dcherian

I found it too difficult to consistently modify the coordinates attribute to include the new z variables so I removed the coordinates attribute from everything, but I think that is not a reliable fix either. What do you suggest?

kthyng avatar Aug 10 '22 17:08 kthyng

I think the only option is to keep coordinates up-to-date (e.g. if you want to propagate that z variable).

This might be another use case for #253; automatically update coordinates attribute for various operations.

So ds.sel(latitude=4, drop=True) would remove the name of the latitude variable from coordinates attributes on all DataArrays where applicable.
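The bookkeeping this would involve can be sketched in plain Python. The helper drop_from_coordinates below is hypothetical and is not part of cf-xarray or Xarray; it operates on a plain mapping of variable name to attribute dict rather than on a real Dataset, and the variable names are made up for the example.

```python
def drop_from_coordinates(attrs_by_var, dropped):
    """Return copies of the attribute dicts with `dropped` removed from each `coordinates` string.

    Hypothetical helper for illustration; not a cf-xarray or Xarray API.
    """
    updated = {}
    for name, attrs in attrs_by_var.items():
        new_attrs = dict(attrs)
        if "coordinates" in new_attrs:
            kept = [c for c in new_attrs["coordinates"].split() if c != dropped]
            if kept:
                new_attrs["coordinates"] = " ".join(kept)
            else:
                # Nothing left to list, so remove the attribute entirely.
                del new_attrs["coordinates"]
        updated[name] = new_attrs
    return updated


attrs_by_var = {
    "zeta": {"coordinates": "lon_rho lat_rho"},
    "mask": {"coordinates": "lat_rho"},
    "h": {"units": "m"},
}
print(drop_from_coordinates(attrs_by_var, "lat_rho"))
# {'zeta': {'coordinates': 'lon_rho'}, 'mask': {}, 'h': {'units': 'm'}}
```

The same idea would apply in the other direction: appending the name of a newly computed z variable to the coordinates string of every variable it belongs to.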

dcherian avatar Aug 10 '22 17:08 dcherian

Thanks @dcherian! I'll close this but yes I do think updating coordinates attributes would be hugely useful.

kthyng avatar Aug 11 '22 14:08 kthyng

Reopening since it would be nice to add this stuff to the documentation.

dcherian avatar Aug 11 '22 15:08 dcherian

Sorry I have to kick this down the road a bit, but it's on my to do list.

kthyng avatar Aug 15 '22 16:08 kthyng