`iris` <-> `xarray` conversion issues

Open w-k-jones opened this issue 1 year ago • 4 comments

Keeping track of issues encountered when converting between iris Cubes and xarray DataArrays:

  • Round-trip conversion (xarray -> iris -> xarray) of integer DataArrays raises a TypeError for xarray < v2023.06 and emits a RuntimeWarning for xarray >= v2023.06. This happens because the core data is converted to a masked array when converting to an iris Cube, and xarray then tries to fill that array with `np.nan` when converting back.
    • Workaround: copy the cube with array-like core_data when converting from xarray to iris: `cube = da.to_iris().copy(da.data)`
  • Dims without coords are renamed to default, unnamed dim names during round-trip conversion (xarray -> iris -> xarray)
    • e.g. `("x", "y")` -> `("dim_0", "dim_1")`
    • Need to save the original dim names and remap them back after output
  • xarray uses coord var_name, whereas iris uses standard_name
    • Need to map between the two to match input standard for pandas dataframe output column names
    • standard_name is not necessarily unique, which can cause problems in tobac. e.g. GOES ABI data
  • cftime 😡
    • Conversion of a DataFrame cftime column to np.datetime64: `xr.CFTimeIndex(features["time"].to_numpy()).to_datetimeindex()`
  • iris cannot handle sub-second (e.g. ns) time values
    • Need to convert times to `np.datetime64[s]` before converting to iris
    • Need to map back to the original time values in output DataArrays/DataFrames to ensure functions like bulk_statistics work correctly
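
As a rough illustration of the dim-name remapping mentioned above, the sketch below builds a mapping from the names found after the round trip back to the original names. The helper `dim_name_mapping` is hypothetical (it is not part of tobac, iris, or xarray), and it assumes the round trip keeps the number and order of dimensions unchanged.

```python
def dim_name_mapping(original_dims, round_tripped_dims):
    """Map dim names found after the round trip back to the original names.

    Hypothetical helper: assumes dimensions keep their order, so names are
    matched by position.
    """
    if len(original_dims) != len(round_tripped_dims):
        raise ValueError("dimension count changed during the round trip")
    # Only keep entries for names that actually changed
    return {
        new: old
        for old, new in zip(original_dims, round_tripped_dims)
        if new != old
    }


# Dim names before and after the round trip, as in the example above
mapping = dim_name_mapping(("x", "y"), ("dim_0", "dim_1"))
print(mapping)  # {'dim_0': 'x', 'dim_1': 'y'}
```

With xarray, a mapping like this could then be passed to `DataArray.rename`, e.g. `da_out.rename(mapping)`, to restore the original names on the output.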

Please add any more issues that crop up

w-k-jones, Dec 04 '23 11:12

For #354, I'm working on a modification to the decorators that passes an optional hidden kwarg that notes whether an iris->xarray conversion occurred, to hopefully help with some of these issues.
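
To make the idea concrete, here is a minimal sketch of a decorator that converts a legacy input and passes a hidden keyword argument recording that a conversion happened. It is not the implementation from #354: `LegacyCube`, `NewArray`, `accepts_legacy_input`, and the kwarg name `_converted_from_legacy` are all hypothetical stand-ins, used so the example runs without iris or xarray installed.

```python
import functools


class LegacyCube:
    """Hypothetical stand-in for an iris Cube (not the real class)."""

    def __init__(self, data):
        self.data = data


class NewArray:
    """Hypothetical stand-in for an xarray DataArray (not the real class)."""

    def __init__(self, data):
        self.data = data


def accepts_legacy_input(func):
    """Convert a LegacyCube first argument and tell func that it happened."""

    @functools.wraps(func)
    def wrapper(field, *args, **kwargs):
        converted = isinstance(field, LegacyCube)
        if converted:
            field = NewArray(field.data)
        # Hidden kwarg (hypothetical name) recording whether a conversion occurred
        return func(field, *args, _converted_from_legacy=converted, **kwargs)

    return wrapper


@accepts_legacy_input
def describe(field, _converted_from_legacy=False):
    # Example use of the flag: pick the naming convention the caller expects
    naming = "standard_name" if _converted_from_legacy else "var_name"
    return {"n_values": len(field.data), "naming": naming}


print(describe(LegacyCube([1, 2, 3])))  # {'n_values': 3, 'naming': 'standard_name'}
print(describe(NewArray([1, 2])))       # {'n_values': 2, 'naming': 'var_name'}
```

The wrapped function always receives the new type, and the flag lets it adapt its output (for example, column naming) to the type the caller originally passed in.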

freemansw1, Dec 04 '23 15:12

Sounds like a good plan. Would it make sense to split the decorator changes into a separate branch/PR so they can be worked on concurrently with converting other sections of the code to xarray?

Also, most of the conversion issues seem to arise when converting from xarray to iris and then back again, so in the long term they should hopefully be less of a problem once the internals are switched to xarray.

w-k-jones, Dec 04 '23 15:12

> Sounds like a good plan. Would it make sense to split the decorator changes into a separate branch/PR so they can be worked on concurrently with converting other sections of the code to xarray?

I can do that once it's in a working condition.

freemansw1, Dec 04 '23 17:12

> > Sounds like a good plan. Would it make sense to split the decorator changes into a separate branch/PR so they can be worked on concurrently with converting other sections of the code to xarray?
>
> I can do that once it's in a working condition.

This is now available in #380.

freemansw1, Dec 05 '23 15:12