tobac icon indicating copy to clipboard operation
tobac copied to clipboard

Switch segmentation to use xarray internally

Open freemansw1 opened this issue 11 months ago • 12 comments

In line with #354, this PR switches segmentation to use xarray internally and also does some light refactoring of the segmentation package.

This PR includes #354 so that I could make sure that it all worked together. #354 should be merged first, so I am marking this as a draft PR until that is merged.

  • [ ] Have you followed our guidelines in CONTRIBUTING.md?
  • [ ] Have you self-reviewed your code and corrected any misspellings?
  • [ ] Have you written documentation that is easy to understand?
  • [ ] Have you written descriptive commit messages?
  • [ ] Have you added NumPy docstrings for newly added functions?
  • [ ] Have you formatted your code using black?
  • [ ] If you have introduced a new functionality, have you added adequate unit tests?
  • [ ] Have all tests passed in your local clone?
  • [ ] If you have introduced a new functionality, have you added an example notebook?
  • [ ] Have you kept your pull request small and limited so that it is easy to review?
  • [ ] Have the newest changes from this branch been merged?

freemansw1 avatar Mar 13 '24 15:03 freemansw1

Linting results by Pylint:

Your code has been rated at 8.72/10 (previous run: 8.72/10, +0.00) The linting score is an indicator that reflects how well your code version follows Pylint’s coding standards and quality metrics with respect to the RC_v1.6.0 branch. A decrease usually indicates your new code does not fully meet style guidelines or has potential errors.

github-actions[bot] avatar Mar 13 '24 15:03 github-actions[bot]

The codecov results indicate to me that we will need to specifically exclude the versions of xarray/iris that cause a conversion incompatibility when we release 1.6.0.

freemansw1 avatar Mar 13 '24 15:03 freemansw1

Awesome work, let me know if you would like me to start reviewing :)

w-k-jones avatar Mar 13 '24 17:03 w-k-jones

Note: 360 day calendar issues and times not matching up between feature detection and segmentation

freemansw1 avatar Apr 12 '24 14:04 freemansw1

@freemansw1, was this already ready for a re-review (maybe after #354 is merged) or not quite yet?

I can have a look at what is going on with the Jupyter CI since it seems to be caused by the bulk statistics. For some reason, the iris_to_xarray decorator breaks here when get_statistics_from_mask() is called:

https://github.com/tobac-project/tobac/blob/201c6015d0237303c4ede5db3f995b115b8be9bb/tobac/utils/bulk_statistics.py#L214-L217

https://github.com/tobac-project/tobac/blob/201c6015d0237303c4ede5db3f995b115b8be9bb/tobac/utils/decorators.py#L178

We do not use the decorator when the bulk statistics are computed within the feature detection/ segmentation, so the offline computation is the only notebook that fails. It is odd, though, because the input format looks the same as in the working version and the unit test for that function also passes. I will look more into this.

JuliaKukulies avatar Jul 23 '24 00:07 JuliaKukulies

Thanks, @JuliaKukulies . I'm hesitant to mark this ready for review until #354 is merged, but happy for you to look into the bulk statistics issue.

freemansw1 avatar Jul 23 '24 13:07 freemansw1

OK, I'm working on more thoroughly testing this now that #354 is merged. One issue that I'm consistently noticing is issues between nanosecond/microsecond conversions. This is a note to myself to add the ability to find segmentation that is padded or some other approach to this.

freemansw1 avatar Aug 30 '24 19:08 freemansw1