Bad units on geopotential height

Open dopplershift opened this issue 7 years ago • 10 comments

At least on 0.25 degree GFS (http://thredds.ucar.edu/thredds/dodsC/grib/NCEP/GFS/Global_0p25deg/Best.html) the units on Geopotential_height_surface and Geopotential_height_isobaric are listed as 'gpm'. There are so many questions that this raises, among them:

  • What does this even stand for? (GeoPotential Meters? In which case ew.)
  • Is this actually udunits compliant?
  • How is that even useful? I'm not going to go find 'gpm' in Wallace&Hobbs.

sigh I don't even...

dopplershift avatar Aug 18 '16 23:08 dopplershift

It's "geopotential meter".

See the bottom of the first page e.g. http://www.ofcm.gov/fmh3/pdf/12-app-d.pdf

Or page 201 of Miller's "Applied Thermodynamics for Meteorologists", https://books.google.com/books?id=mbwsCQAAQBAJ&pg=PA201&lpg=PA201

rschmunk avatar Aug 19 '16 15:08 rschmunk

The problem is that a unit of gpm is not CF-compliant, since it's not a UDUNITS-compatible unit. The closest is gp, which is geopotential, with a definition of "gravity". This would mean, in UDUNITS parlance, that gpm would be gp * m, i.e. m^2 / s^2. This would work for "geopotential", but not "geopotential height".
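To make the dimensional argument concrete, here is a toy sketch in Python. It does not use UDUNITS; the unit definitions are illustrative assumptions taken from the comment above (gp defined as "gravity", an acceleration), and `multiply` is a hypothetical helper.

```python
# Toy dimensional analysis: each unit is a map of SI base dimension -> exponent.
# These definitions are illustrative assumptions, not the UDUNITS database.
METER = {"m": 1}
GRAVITY = {"m": 1, "s": -2}  # an acceleration
GP = GRAVITY                 # assumption: "gp" is defined as "gravity"


def multiply(a, b):
    """Multiply two units by adding their dimension exponents."""
    out = dict(a)
    for dim, exp in b.items():
        out[dim] = out.get(dim, 0) + exp
    # Drop dimensions that cancelled out.
    return {dim: exp for dim, exp in out.items() if exp != 0}


gpm_as_product = multiply(GP, METER)
print(gpm_as_product)           # {'m': 2, 's': -2}
print(gpm_as_product == METER)  # False: not a length
```

Read this way, gpm has the dimensions of geopotential (m^2 / s^2) rather than of a height (m), which is the mismatch described above.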

dopplershift avatar Aug 19 '16 19:08 dopplershift

Just to make the connection between relevant issues, this has been causing problems in MetPy recently: https://github.com/Unidata/MetPy/issues/907.

Would it be worth it resolving it here?

jthielen avatar Aug 02 '18 15:08 jthielen

In this case, the unit attribute is coming from a GRIB parameter table, and that table (generated outside of the TDS) could not care less about UDUNITS, CF, or the rest of the civilized world, for that matter.

A little background...

When the netCDF-Java library reads in GRIB messages, they are translated into the Common Data Model (CDM) for representation within netCDF-Java. Whatever a GRIB table uses as "unit" is stuffed into the unit attribute of a given variable.

Gory detail time (here for anyone who might care)

To be specific, for GRIB-2 messages from NCEP (as an example), the code does the following:

  • Look at section 0, octet 7 (7th set of consecutive 8 bit patterns) of the message (read into an integer value) (represents a discipline)
  • Look at section 4, octet 10 of the message (integer value) (represents a category)
  • Look at section 4, octet 11 (integer) (represents a parameter)
  • Use those three integers to look up a value in the parameter table
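The steps above can be sketched in Python. This is a minimal illustration, not the netCDF-Java code: `PARAMETER_TABLE`, `read_parameter_key`, and `make_section` are hypothetical names, the table is a tiny stand-in for the scraped one, and the message is synthetic. The sketch assumes the GRIB-2 layout in which section 0 is 16 octets long and every later section starts with a 4-octet length followed by a 1-octet section number.

```python
# Hypothetical stand-in for the scraped parameter table:
# (discipline, category, parameter) -> (name, unit string as listed).
PARAMETER_TABLE = {
    (0, 1, 0): ("Specific Humidity", "kg kg-1"),
    (0, 3, 5): ("Geopotential height", "gpm"),
}


def read_parameter_key(message: bytes):
    """Return (discipline, category, parameter) from a GRIB-2 message."""
    if message[:4] != b"GRIB":
        raise ValueError("not a GRIB message")
    discipline = message[6]  # section 0, octet 7 (octets are 1-based)
    offset = 16              # section 0 is 16 octets long
    while offset + 5 <= len(message) and message[offset:offset + 4] != b"7777":
        length = int.from_bytes(message[offset:offset + 4], "big")
        number = message[offset + 4]
        if number == 4:
            # Section 4, octets 10 and 11: category and parameter.
            return discipline, message[offset + 9], message[offset + 10]
        if length < 5:
            raise ValueError("corrupt section length")
        offset += length
    raise ValueError("no section 4 found")


def make_section(number: int, body: bytes) -> bytes:
    """Build a synthetic section: 4-octet length, section number, body."""
    return (5 + len(body)).to_bytes(4, "big") + bytes([number]) + body


# Synthetic message: a dummy section 1, then a section 4 whose octets 10-11
# hold category 3 and parameter 5, then the "7777" end marker.
sections = make_section(1, bytes(16)) + make_section(4, bytes([0, 0, 0, 0, 3, 5])) + b"7777"
total_length = 16 + len(sections)
message = b"GRIB" + bytes(2) + bytes([0, 2]) + total_length.to_bytes(8, "big") + sections

key = read_parameter_key(message)
name, unit = PARAMETER_TABLE[key]
print(key, name, unit)  # (0, 3, 5) Geopotential height gpm
```

Note that the unit string never appears in the message itself. It comes entirely from whatever the table says for those three integers.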

For NCEP, the parameter table is actually an HTML page that we scrape and turn into something easily read by the code. From what I can tell and have been able to gather, these tables are hand maintained and completely separate from the numerical model code and post-processing routines used to generate the GRIB messages. If the code updates and starts generating values in a different unit, someone would need to update the associated "table" (in this case, the HTML page online) so that the rest of the world knows about it. Sometimes the HTML will change for minor typo reasons (something didn't get capitalized to someone's liking), or maybe the listed unit will get updated (change 'percent' to % because it looks prettier on the web?). However, as far as the GRIB message sitting in a GRIB file on disk is concerned, none of that matters because all that's encoded in the actual file are those octets (unlike, say, netCDF). For sure a message might indicate 0-1-0, and the table says "Specific Humidity, kg kg-1", but you never really know with certainty. Also, not all GRIB tables include helpful information like units - sometimes, that's institutional knowledge, and you're lucky to get a 4 character short name associated with the three octets.

TLDR or you just want to keep things PG (no gore):

From a generic standpoint, nothing is wrong so far, other than the fact that someone could list cats per fortnight to the rainbow power as a unit in a GRIB table, written on the back of a napkin, posted as a .png file on Twitter, and that's what we'd have to work with. But mapping GRIB octets to the value of a CDM attribute on a variable? We're good. We're just talking data model mapping here, so whatever GRIB calls a unit (described above for the case of GRIB2 and NCEP...most of the time...), we'll map that to the unit attribute in the CDM.

Is that useful? Maybe?

One option to make the situation a little better is to update the netCDF-Java GRIB code to do the following:

  1. No matter what, take whatever string is populated in the GRIB table, and stuff that into a new attribute called "tableUnit", or something to that effect. tableUnit=cats per fortnight to the rainbow power looks perfectly possible and reasonable to me (when talking about GRIB at least).
  2. Try to clean up any incoming units read in from GRIB tables, and only use a "clean" unit in the units attribute. If a clean (i.e. UDUNITS-compatible) unit cannot be found, then simply leave off the unit attribute.
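A rough Python sketch of what those two steps might look like. This is not the netCDF-Java implementation: `unit_attributes` is a hypothetical helper, and the small allow-list `KNOWN_CLEAN_UNITS` is only a stand-in for a real UDUNITS compatibility check.

```python
# Hypothetical allow-list standing in for a real UDUNITS compatibility check.
KNOWN_CLEAN_UNITS = {"m", "K", "Pa", "kg kg-1"}


def unit_attributes(table_unit: str) -> dict:
    """Build variable attributes from the unit string found in a GRIB table."""
    # Step 1: always preserve the raw table string, whatever it is.
    attrs = {"tableUnit": table_unit}
    # Step 2: only set "units" when the cleaned string passes the check.
    cleaned = table_unit.strip()
    if cleaned in KNOWN_CLEAN_UNITS:
        attrs["units"] = cleaned
    return attrs


print(unit_attributes("kg kg-1"))  # {'tableUnit': 'kg kg-1', 'units': 'kg kg-1'}
print(unit_attributes("gpm"))      # {'tableUnit': 'gpm'}
```

With this shape, a variable whose table unit is gpm would keep the original string but carry no units attribute at all.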

My only reservation with this approach is that we are applying limitations from the CF conventions as to what a unit is, rather than the more generic guidelines provided by the Best Practices outlined in the netCDF Users Guide. That said, we try to follow CF as best we can when generating netCDF files for return by the TDS, or in how we expose datasets through the CDM and various services in the TDS, so it's probably ok to do it again. One downside - some things that have a unit attribute now would not in the future (but will have the new tableUnit attribute). Any code that's looking for a unit attribute, that also knows how to handle the nastiness of GRIB table units, won't work anymore, as the unit attribute won't exist.

A second option? If unit == gpm then unit = m. I'm almost certain this is wrong.

Also, not 100% sure, but there is probably something in the NIST guide to SI units that describes how we are overloading the term unit in a not-so-good way that will only lead to death and destruction when we talk about geopotential meters as a unit.

lesserwhirls avatar Aug 02 '18 18:08 lesserwhirls

@lesserwhirls Thank you for the in-depth explanation...it makes more sense why things are the way they are now, and also why it doesn't look like there can be an "easy" solution here.

Also, thanks for the NIST reference...I think here are the relevant sections:

7.4 Unacceptability of attaching information to units: When one gives the value of a quantity, it is incorrect to attach letters or other symbols to the unit in order to provide information about the quantity or its conditions of measurement. [...]

7.5 Unacceptability of mixing information with units: When one gives the value of a quantity, any information concerning the quantity or its conditions of measurement must be presented in such a way as not to be associated with the unit. [...]

jthielen avatar Aug 02 '18 19:08 jthielen

What are your thoughts on the creation of a new tableUnit?

lesserwhirls avatar Aug 02 '18 19:08 lesserwhirls

@lesserwhirls We already have Grib_* and Grib2_* (and I assume Grib1_*) attributes. For instance,

        Int32 Grib2_Parameter 0, 19, 1;
        String Grib2_Parameter_Discipline "Meteorological products";
        String Grib2_Parameter_Category "Physical atmospheric Properties";
        String Grib2_Parameter_Name "Albedo";

So, how about Grib2_Parameter_Units?

ethanrd avatar Aug 02 '18 19:08 ethanrd

In general, I like the idea of a new, separate attribute because it allows the units attribute to have at least some guarantee of sanity for users and libraries using the data, while still preserving the unit that comes in from the GRIB table. Also, it looks like the Best Practices still say to use UDUNITS conventions where possible for the units attribute.

But, I also share the concern about breaking compatibility for code that relies on the current contents of units. Although, at least in that case the fix should just be a simple rename?

Also, Grib2_Parameter_Units sounds like a good name for the attribute.

jthielen avatar Aug 02 '18 19:08 jthielen

👍 to Grib2_Parameter_Units for consistency with the other Grib2-specific attributes.

While I'm sensitive to the change breaking people, my argument would be:

  1. The current mapping defines :Conventions = "CF-1.6"
  2. Therefore the units attribute needs to be CF-compliant.

dopplershift avatar Aug 02 '18 19:08 dopplershift

See Unidata/thredds#1130

lesserwhirls avatar Aug 02 '18 20:08 lesserwhirls