Implement dask-specific methods

Open darothen opened this issue 2 years ago • 5 comments

This is an initial implementation of the feature requested in #97.

This first implementation closely follows how xarray.Dataset implements the same methods. For most of them this should work fine: we iterate over all the nodes in the tree, starting at the root, and perform the corresponding dask collection operation on each. However, __dask_postcompute__ and __dask_postpersist__ are a bit more complicated; some additional testing is required to ensure that we're correctly applying the available support utilities to reconstruct the final DataTree without any superfluous work.
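
For readers unfamiliar with the pattern, here is a minimal sketch of the idea described above: merge each node's task graph, collect the output keys in tree order, and hand back a finalizer that rebuilds a tree-shaped result. It uses only the standard library so it runs without dask installed. ToyNode, toy_get, and toy_compute are hypothetical names made up for illustration; they are not part of datatree, xarray, or dask, and toy_get is only a simplified stand-in for a real dask scheduler. The actual code in this PR may differ.

```python
from operator import add, mul


class ToyNode:
    """Hypothetical stand-in for a tree node; NOT the datatree API."""

    def __init__(self, name, graph=None, key=None, children=()):
        self.name = name
        self.graph = dict(graph or {})  # this node's own task graph
        self.key = key                  # output key in that graph, or None
        self.children = list(children)

    def walk(self, path="/"):
        """Yield (path, node) pairs, starting at the root."""
        yield path, self
        for child in self.children:
            yield from child.walk(path.rstrip("/") + "/" + child.name)

    def __dask_graph__(self):
        # Merge the graphs of every node into one mapping.
        merged = {}
        for _, node in self.walk():
            merged.update(node.graph)
        return merged

    def __dask_keys__(self):
        # Output keys, in the same order the finalizer expects.
        return [node.key for _, node in self.walk() if node.key is not None]

    def __dask_postcompute__(self):
        # Return (finalizer, extra_args); the caller invokes
        # finalizer(results, *extra_args) once the keys are computed.
        paths = [path for path, node in self.walk() if node.key is not None]
        return self._finalize, (paths,)

    @staticmethod
    def _finalize(results, paths):
        # Rebuild a tree-shaped result; a plain dict keyed by path here.
        return dict(zip(paths, results))


def toy_get(graph, keys):
    """Simplified stand-in for a dask scheduler (assumption, not dask code)."""

    def evaluate(term):
        if isinstance(term, tuple) and term and callable(term[0]):
            func, *args = term
            return func(*(evaluate(arg) for arg in args))
        if isinstance(term, str) and term in graph:
            return evaluate(graph[term])
        return term  # a literal value

    return [evaluate(key) for key in keys]


def toy_compute(tree):
    results = toy_get(tree.__dask_graph__(), tree.__dask_keys__())
    finalize, extra_args = tree.__dask_postcompute__()
    return finalize(results, *extra_args)


root = ToyNode(
    "root",
    graph={"x": 1, "y": (add, "x", 2)},
    key="y",
    children=[ToyNode("a", graph={"z": (mul, 3, 4)}, key="z")],
)
print(toy_compute(root))  # {'/': 3, '/a': 12}
```

The point of the sketch is the ordering contract: the finalizer only works if the paths it zips against are produced by the same traversal that produced the keys, which is the part of the post-compute and post-persist hooks that needs careful testing.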

  • [x] Closes #97
  • [ ] Tests added
  • [ ] Passes pre-commit run --all-files
  • [ ] New functions/methods are listed in api.rst
  • [ ] Changes are summarized in docs/source/whats-new.rst

darothen avatar Jan 13 '23 23:01 darothen

Tagging @TomNicholas. I'll work on testing this over the coming days as I have time.

darothen avatar Jan 13 '23 23:01 darothen

Here's a gist based on @jbusecke's CMIP6 demo showing the top-level integration of load and compute (you can easily modify it to show that persist works too).

Still left to do: writing some test cases and digging deeper to make sure that the dask collection methods provided here are actually being used.

darothen avatar Jan 14 '23 20:01 darothen

Thanks for the quick review @TomNicholas, hoping to address it later today or tomorrow. Note on the line repetition: it looks like I screwed up a merge somewhere, so I'll need to fix that separately.

darothen avatar Jan 15 '23 18:01 darothen

@darothen wondering if you'll have any time soon to revisit this PR? It would be great to get it in soon because Julius and I are writing another blog post about using datatree with dask on CMIP6 data.

TomNicholas avatar Mar 06 '23 20:03 TomNicholas

@TomNicholas I'm hacking on some projects this weekend, let me see if I can wrap things up. Apologies for the delay... it became very hectic at work shortly after the hackathon and I haven't had much time for side projects.

darothen avatar Mar 25 '23 20:03 darothen