xarray quantify accessor doesn't return a Quantity for a coordinate variable
What went wrong?
I read an xarray dataset, with units correctly attached. The metpy quantify accessor stripped units but did not return a Pint quantity. An extract of sample data, which is derived from a DOE ASR interpolated radiosonde dataset, is attached.
xarray = 2023.8.0, numpy = 1.25.2, pint = 0.22 height_test_data.nc.zip
Operating System
MacOS
Version
1.5.1
Python Version
3.10.12
Code to Reproduce
import xarray as xr
import metpy.units as units
ds = xr.open_dataset('height_test_data.nc')
height_q = ds.height.metpy.quantify()
height_q_good = ds.height * units.units(ds.height.attrs['units'])
print(height_q.metpy.units) # dimensionless
print(height_q_good.metpy.units) # kilometers
Errors, Traceback, and Logs
No response
This is an unfortunate corner case in the xarray units handling, as height in this dataset is a coordinate variable, so MetPy can't convert the corresponding index to a Pint object (support for a PintIndex appears to be a still ongoing effort in pint-xarray and upstream in xarray). Since MetPy's .quantify relies upon DataArray.copy, the index and data variable remain linked, so quantification fails. Adding the units manually with multiplication happens to separate the data variable from the index (note that height_q_good.data and height_q_good.height.data differ).
Are we okay with MetPy's current behavior in this case, or should we include some kind of workaround that dissociates the data variable from the coordinate index?
I think it's essential from a user point of view to handle or the coordinate data … my use case was, for instance, running it through the ECAPE calculation, which requires height. If I were a student and it failed like this, there'd be no way to even know what went wrong.
It was my understanding that with the index refactor in xarray that we should be able to make this happen now? I haven't tried recently, but I think something basic would work?
Stretch goal with adding such an index inside MetPy would be that it would obviate the need for our own sel, etc. to handle units.
It was my understanding that with the index refactor in xarray that we should be able to make this happen now? I haven't tried recently, but I think something basic would work?
I unfortunately don't think things are quite ready yet for what we would need (getting very close though):
- https://github.com/pydata/xarray/pull/8124
- https://github.com/xarray-contrib/pint-xarray/pull/163