isentropic_interpolation_as_dataset can create "None" Data Array

Open dopplershift opened this issue 2 years ago • 6 comments

When calling isentropic_interpolation_as_dataset, if you pass in a DataArray that hasn't been part of a Dataset previously, like say the result of a calculation:

isen_level = np.array([320]) * units.kelvin
pressure = ds.Temperature_isobaric.metpy.vertical
mixing = mpcalc.mixing_ratio_from_relative_humidity(pressure, ds.Temperature_isobaric, ds.Relative_humidity_isobaric)
isen_ds = mpcalc.isentropic_interpolation_as_dataset(isen_level, ds.Temperature_isobaric,
                                                     ds['u-component_of_wind_isobaric'],
                                                     ds['v-component_of_wind_isobaric'],
                                                     mixing)

the resulting Dataset looks like:

<xarray.Dataset>
Dimensions:                       (lat: 101, lon: 161)
Coordinates:
    isentropic_level              int64 320
    time                          datetime64[ns] 2022-03-02T06:00:00
    metpy_crs                     object Projection: latitude_longitude
    reftime                       datetime64[ns] 2022-03-02
  * lat                           (lat) float32 60.0 59.5 59.0 ... 10.5 10.0
  * lon                           (lon) float32 230.0 230.5 ... 309.5 310.0
Data variables:
    pressure                      (lat, lon) float32 <Quantity([[245.1992  24...
    temperature                   (lat, lon) float64 <Quantity([[214.15431785...
    u-component_of_wind_isobaric  (lat, lon) float64 <Quantity([[ 5.90835487 ...
    v-component_of_wind_isobaric  (lat, lon) float64 <Quantity([[ 5.18840406 ...
    None                          (lat, lon) float64 <Quantity([[2.30282411e-...

The workaround is to give mixing a name, e.g. mixing.name = 'mixing_ratio'. Things to consider:

  1. Error (maybe warning) when trying to add an unnamed DataArray to the result
  2. Support **kwargs on isentropic_interpolation_as_dataset so that there's a way to pass in the array with a name to use. Other than the few keyword-only parameters we have (which can't be used as variable names), this would work and be backwards-compatible.
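The two options above can be sketched with plain xarray. This is only an illustration under stated assumptions: collect_named_fields is a hypothetical helper, not part of MetPy, and it stands in for whatever isentropic_interpolation_as_dataset does internally to gather its extra fields. Positional arrays must already carry a name (option 1, here as an error), while keyword arguments supply the name to use (option 2).

```python
import numpy as np
import xarray as xr


def collect_named_fields(*args, **kwargs):
    """Hypothetical helper (not MetPy API): map output names to DataArrays."""
    fields = {}
    for position, field in enumerate(args):
        # Option 1: refuse unnamed positional arrays instead of emitting "None"
        if field.name is None:
            raise ValueError(
                f'Positional argument {position} has no name; set .name or '
                'pass it as a keyword argument.'
            )
        fields[field.name] = field
    for name, field in kwargs.items():
        # Option 2: the keyword itself becomes the variable name
        fields[name] = field.rename(name)
    return fields


unnamed = xr.DataArray(np.zeros((2, 3)), dims=('lat', 'lon'))
named = unnamed.rename('temperature')

result = xr.Dataset(collect_named_fields(named, mixing_ratio=unnamed))
```

With this shape, an unnamed array passed positionally fails early with a readable message, and existing calls that pass named arrays positionally keep working.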

dopplershift avatar Mar 02 '22 05:03 dopplershift

What I have generally done, instead of assigning the output of the function to a standalone variable, is add it right to the dataset that is my main data object. For example,

ds['mixing_ratio'] = mpcalc.mixing_ratio_from_relative_humidity(pressure, ds.Temperature_isobaric, ds.Relative_humidity_isobaric)

This is also handy for the declarative syntax plotting, since it is based off of the xarray dataset object, and is what I generally use for wind speed calculations.
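A small xarray-only sketch of why this pattern avoids the problem: a result computed from differently named variables has no name of its own, but assigning it into the Dataset gives it the key as its name. The variable names u, v, and wind_speed and the toy values are made up for illustration.

```python
import numpy as np
import xarray as xr

# Toy dataset with made-up wind components
ds = xr.Dataset({
    'u': (('lat', 'lon'), np.full((2, 3), 3.0)),
    'v': (('lat', 'lon'), np.full((2, 3), 4.0)),
})

# Combining variables with different names yields an unnamed DataArray
speed = np.sqrt(ds['u'] ** 2 + ds['v'] ** 2)
standalone_name = speed.name

# Assigning into the Dataset names the variable after its key
ds['wind_speed'] = speed
assigned_name = ds['wind_speed'].name
```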

kgoebber avatar Mar 02 '22 20:03 kgoebber

That makes sense, but we definitely need to better handle the case where someone doesn't do that.

dopplershift avatar Mar 02 '22 22:03 dopplershift

We do currently tell people the following

Must have names in order to have a well-formed output Dataset.

in the docs, but that may not be helpful enough on its own. I'd be fine bumping this up to a warning, though I could see people being confused: "I used MetPy to make this calculation, why is it yelling at me?"

Could we add names on function output as part of the xarray wrapper?

edit: since this seems to only matter much where we're constructing a new dataset like this, I'm fine going the kwargs route.

dcamron avatar Mar 02 '22 23:03 dcamron

Well, the maintainer of the library missed that this was in the docs, so... (🐑)

dopplershift avatar Mar 03 '22 00:03 dopplershift

I just ran across a similar issue (I think). I was reading 4 variables from a netCDF file with Xarray, all of which I reduce to DataArray objects. I use one of those DataArrays to call height_to_pressure_std(), which also returns a DataArray, but the variable has no name. @kgoebber's example above works well for Datasets, but in this case I am simply assigning the result to a variable like this: pressure = mpcalc.height_to_pressure_std(HeightDataArray)

I ran into trouble trying to do xr.merge() on the 4 original DataArrays and the 5th from height_to_pressure_std(), at which point Xarray complained and told me the result from height_to_pressure_std() had no name.
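For reference, the merge failure can be reproduced with xarray alone. In this sketch the unnamed pressure array is a hand-built stand-in for a calculation result, and its values are placeholders, not output from height_to_pressure_std().

```python
import numpy as np
import xarray as xr

height = xr.DataArray(np.array([0.0, 1000.0]), dims=('z',), name='height')
# Stand-in for a calculation result that comes back without a name
# (placeholder values, not a real standard-atmosphere conversion)
pressure = xr.DataArray(np.array([1000.0, 900.0]), dims=('z',))

try:
    xr.merge([height, pressure])
    merge_error = None
except ValueError as err:
    # xarray cannot convert an unnamed DataArray into a Dataset variable
    merge_error = str(err)

# Supplying a name first lets the merge go through
merged = xr.merge([height, pressure.rename('pressure')])
```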

I think in this case I could do:

pressure = mpcalc.height_to_pressure_std(HeightDataArray)
pressure.name = 'pressure'
ds = xr.merge([da1,da2,da3,da4,pressure])

or, as @kgoebber above notes, switch the order of how I do things:

ds = xr.merge([da1,da2,da3,da4])
ds['pressure'] = mpcalc.height_to_pressure_std(HeightDataArray)

but I was wondering if it would be useful to think about building in some smarts for this case: the Xarray pre-processing could sense that a DataArray was passed in and, if it is returning a DataArray, set the name attribute to something sensible based on the calculation being performed (in this case, "pressure"). I am a little bit out of my lane here, but wondering if this could be done.

Maybe this is what @dcamron means by "Could we add names on function output as part of the xarray wrapper?" above?
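One way to picture the idea, as a rough sketch only: a decorator that gives an unnamed DataArray result a default name. Both default_output_name and toy_height_to_pressure are hypothetical names invented here; this is not how MetPy's xarray wrapper is implemented, and the linear "calculation" is a placeholder rather than the standard-atmosphere formula.

```python
import functools

import numpy as np
import xarray as xr


def default_output_name(name):
    """Hypothetical decorator: name a DataArray result that has no name."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            # Only fill in a name when the result does not already have one
            if isinstance(result, xr.DataArray) and result.name is None:
                result = result.rename(name)
            return result
        return wrapper
    return decorator


@default_output_name('pressure')
def toy_height_to_pressure(height):
    # Placeholder relationship for illustration; returns an unnamed DataArray
    return xr.DataArray(1000.0 - 0.1 * height.values, dims=height.dims)


height = xr.DataArray(np.array([0.0, 100.0]), dims=('z',), name='height')
pressure = toy_height_to_pressure(height)

# The named result can now be merged without extra steps
ds = xr.merge([height, pressure])
```

A default like this would still need a policy for name collisions, for example when a Dataset already contains a variable called pressure.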

DanielAdriaansen avatar May 06 '22 14:05 DanielAdriaansen

@DanielAdriaansen Yes, I think what you mentioned there is in line with what @dcamron suggested.

I'll add that another consequence of this (just hit in a workshop) is that if you call ds.metpy.parse_cf() on the result from isentropic_interpolation_as_dataset, you end up with a confusing RecursionError. Need to avoid that.

dopplershift avatar Aug 15 '23 18:08 dopplershift