Xarray conversion - Project Tracking Card

Open akeeste opened this issue 2 years ago • 2 comments

In the near future, MHKiT will begin transitioning its base data structure from pandas to xarray. The development team made this decision for several reasons:

  • It is much easier to implement functionality in xarray and convert xarray to pandas than to convert pandas to xarray. Both xarray and pandas can still be used as function input/output.
  • User requests @ryancoe @cmichelenstrofer
  • Increases interoperability with ME Data Pipeline @jmcvey3 and Capytaine @cmichelenstrofer
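
To illustrate the first point, here is a minimal sketch of the round trip between the two libraries. The variable names (`Hs`, `Tp`) and values are made up for illustration and are not taken from MHKiT; only public xarray and pandas calls are used.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical data: two variables sampled at four daily time steps.
time = pd.date_range("2023-01-01", periods=4, freq="D", name="time")
ds = xr.Dataset(
    {
        "Hs": ("time", np.array([1.0, 1.5, 2.0, 1.2])),
        "Tp": ("time", np.array([8.0, 9.0, 10.0, 8.5])),
    },
    coords={"time": time},
)

# xarray -> pandas: the "time" dimension becomes the DataFrame index.
df = ds.to_dataframe()

# pandas -> xarray: the index name is reused as the dimension name.
# An unnamed index would instead produce a dimension called "index",
# which is one reason this direction needs more care.
ds_roundtrip = df.to_xarray()
```

Going from xarray to pandas only drops information (named dimensions, attributes), while going from pandas to xarray requires choosing dimension names and metadata that the DataFrame may not carry.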

The comment below is a Project Tracking Card (i.e. roadmap) that records the reasoning for this change and details the intended steps to complete the process. There is no hard timeline for each step yet, but we're hoping to make significant progress over FY24.

If @ryancoe @cmichelenstrofer @jmcvey3 or others have modules that you would like prioritized, please let us know!

Closing #78 in favor of tracking things here.

akeeste, Aug 15 '23 15:08

Project Tracking Card

Conversion to Xarray

Target

MHKiT fully supports xarray as its default, base dependency throughout the software package. Functions optionally accept pandas input/output with a new flag, but internal MHKiT functionality uses xarray.

User Story

"As a ___, I want ___, so that ___"

  • As a user, I want a new version of xarray, so that MHKiT keeps up to date with other software I use
  • As a user, I want xarray support, so that I can more easily couple MHKiT with other software
  • As a developer, I want an xarray base, because it is easier to convert xarray to pandas than pandas to xarray
  • As a developer, I want an xarray base, to consolidate and standardize the base dependency of MHKiT

Card

  • [x] 1. MHKiT uses a mix of xarray and pandas. The vast majority of functionality is based on pandas and can optionally input/output xarray
  • [x] 2. Update developer expectations and processes so that new functionality is based on xarray.
  • [x] 3. Prioritize modules for xarray conversion based on scale and user needs.
  • [x] 4. 33% of modules' internal functionality is based on xarray
    • Dolfyn module (already in xarray)
    • mooring module (already in xarray)
    • loads module #279
    • power module #282
  • [x] 5. 66% of modules' internal functionality is based on xarray
    • river and tidal modules #285
  • [x] 6. 100% of modules' internal functionality is based on xarray
    • wave module #302, #310
    • utils module
  • [x] 7. MHKiT fully allows for xarray input/output, but function IO defaults are largely unchanged. Functions retain their previous IO standard for pandas/xarray. Begin converting flags from denoting optional xarray to denoting optional pandas.
  • [ ] 8. Future releases could optionally change the behavior of the to_pandas flag, perhaps defaulting to the type (xarray vs pandas) that is input to a function
  • [x] 9. At the "Target" status listed above
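
For reference, the pattern described in the Target and in steps 7 and 8 could look roughly like the sketch below. This is only an illustration under assumptions: `remove_mean` and `_to_dataset` are hypothetical names, not MHKiT functions, and the real type handling in MHKiT may differ. Only the `to_pandas` flag name comes from this card.

```python
import pandas as pd
import xarray as xr


def _to_dataset(data):
    """Hypothetical helper: coerce pandas or xarray input to an xr.Dataset."""
    if isinstance(data, xr.Dataset):
        return data
    if isinstance(data, xr.DataArray):
        name = data.name if data.name is not None else "data"
        return data.to_dataset(name=name)
    if isinstance(data, pd.Series):
        name = data.name if data.name is not None else "data"
        data = data.to_frame(name=name)
    if isinstance(data, pd.DataFrame):
        return data.to_xarray()
    raise TypeError(f"Unsupported input type: {type(data)}")


def remove_mean(data, to_pandas=True):
    """Hypothetical function: subtract the mean of each variable.

    Internally everything is xarray; pandas is only used at the boundary.
    """
    ds = _to_dataset(data)
    result = ds - ds.mean()
    # The flag denotes optional pandas output rather than optional xarray.
    return result.to_dataframe() if to_pandas else result


df_in = pd.DataFrame({"a": [1.0, 2.0, 3.0]})
out_pandas = remove_mean(df_in)                    # pandas.DataFrame
out_xarray = remove_mean(df_in, to_pandas=False)   # xarray.Dataset
```

Step 8 would amount to changing the default, for example returning pandas only when the input was pandas, without touching the internal xarray logic.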

akeeste, Aug 15 '23 16:08

Low priority TODO: WEC-Sim data can be better read using xarray and specifying the DOF as a new dimension instead of tacking it onto variable names
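
A rough sketch of what that could look like, assuming a flat table whose column names carry the DOF as a suffix. The column names (`force_dof1`, ...) and values are hypothetical and do not reflect the actual WEC-Sim output format or MHKiT's reader.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical flat layout: the DOF is tacked onto each variable name.
flat = pd.DataFrame(
    {
        "force_dof1": [1.0, 2.0, 3.0],
        "force_dof2": [4.0, 5.0, 6.0],
        "force_dof3": [7.0, 8.0, 9.0],
    },
    index=pd.Index([0.0, 0.1, 0.2], name="time"),
)

# Same data with the DOF as its own dimension instead of a name suffix.
dofs = [1, 2, 3]
force = xr.DataArray(
    flat[[f"force_dof{d}" for d in dofs]].to_numpy(),
    dims=("time", "dof"),
    coords={"time": flat.index.to_numpy(), "dof": dofs},
    name="force",
)

# Selecting one DOF is now a label lookup rather than string matching.
third_dof = force.sel(dof=3)
```

With this layout, operations across DOFs (for example `force.max(dim="dof")`) no longer require parsing column names.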

akeeste, Apr 17 '24 16:04

#352 largely wraps up this thread. Other items related to xarray and type handling will come up in the future, but this effort is largely resolved and the issue no longer needs to remain open.

akeeste, Oct 22 '24 15:10