tobac
tobac copied to clipboard
Switch segmentation to use xarray internally
In line with #354, this PR switches segmentation to use xarray internally and also does some light refactoring of the segmentation package.
This PR includes #354 so that I could make sure that it all worked together. #354 should be merged first, so I am marking this as a draft PR until that is merged.
- [ ] Have you followed our guidelines in CONTRIBUTING.md?
- [ ] Have you self-reviewed your code and corrected any misspellings?
- [ ] Have you written documentation that is easy to understand?
- [ ] Have you written descriptive commit messages?
- [ ] Have you added NumPy docstrings for newly added functions?
- [ ] Have you formatted your code using black?
- [ ] If you have introduced a new functionality, have you added adequate unit tests?
- [ ] Have all tests passed in your local clone?
- [ ] If you have introduced a new functionality, have you added an example notebook?
- [ ] Have you kept your pull request small and limited so that it is easy to review?
- [ ] Have the newest changes from this branch been merged?
Linting results by Pylint:
Your code has been rated at 8.72/10 (previous run: 8.72/10, +0.00) The linting score is an indicator that reflects how well your code version follows Pylint’s coding standards and quality metrics with respect to the RC_v1.6.0 branch. A decrease usually indicates your new code does not fully meet style guidelines or has potential errors.
The codecov results indicate to me that we will need to specifically exclude the versions of xarray/iris that cause a conversion incompatibility when we release 1.6.0.
Awesome work, let me know if you would like me to start reviewing :)
Note: 360 day calendar issues and times not matching up between feature detection and segmentation
@freemansw1, was this already ready for a re-review (maybe after #354 is merged) or not quite yet?
I can have a look at what is going on with the Jupyter CI since it seems to be caused by the bulk statistics. For some reason, the iris_to_xarray
decorator breaks here when get_statistics_from_mask()
is called:
https://github.com/tobac-project/tobac/blob/201c6015d0237303c4ede5db3f995b115b8be9bb/tobac/utils/bulk_statistics.py#L214-L217
https://github.com/tobac-project/tobac/blob/201c6015d0237303c4ede5db3f995b115b8be9bb/tobac/utils/decorators.py#L178
We do not use the decorator when the bulk statistics are computed within the feature detection/ segmentation, so the offline computation is the only notebook that fails. It is odd, though, because the input format looks the same as in the working version and the unit test for that function also passes. I will look more into this.
Thanks, @JuliaKukulies . I'm hesitant to mark this ready for review until #354 is merged, but happy for you to look into the bulk statistics issue.
OK, I'm working on more thoroughly testing this now that #354 is merged. One issue that I'm consistently noticing is issues between nanosecond/microsecond conversions. This is a note to myself to add the ability to find segmentation that is padded or some other approach to this.