Adds option to output regridding weights
Description
Adds output_weights option to regrid2 and xesmf regridders.
Can be set to true/false to include weights as weights variable in output dataset.
If it's a str, this will be the name of the variable in the output dataset.
The regrid2 regridder formats it's weights as a sparse array with dimensions (out_y, out_x, in_y, in_x).
This matches the behavior of the xesmf regridder. Sparse array is used for space efficiency.
- Closes #550
Checklist
- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my own code
- [ ] My changes generate no new warnings
- [ ] Any dependent changes have been merged and published in downstream modules
If applicable:
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass with my changes (locally and CI/CD build)
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 100.00%. Comparing base (
aaf9ec3) to head (b7f8795). Report is 1 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #752 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 16 16
Lines 1722 1765 +43
=========================================
+ Hits 1722 1765 +43
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
@pochedls @lee1043 @chengzhuzhang @tomvothecoder I wanted to add one more feature to this PR but want to get a consensus first. I want to update the grid functions to not just output a dataset with spatial coords/dimensions but also include a mask variable. I'm curious if this makes sense from a user perspective.
@pochedls @lee1043 @chengzhuzhang @tomvothecoder I wanted to add one more feature to this PR but want to get a consensus first. I want to update the grid functions to not just output a dataset with spatial coords/dimensions but also include a
maskvariable. I'm curious if this makes sense from a user perspective.
I don't see a near-term need for this (for myself) if the regridder is automatically NaN-ing out missing data in a way that is reasonable. But if the mask is useful for interpreting the regridder behavior and it is easy to include it as an output, I'm good with this.
I think this function is useful to have for testing purpose. With the function it would be easier to compare weightings of different regridders. This might be also helpful for high resolution data where regridding can take longer than coarser grid. A following question would be, can the regridder accept re-use of pre-calculated weights? I support this addition unless it significantly drop the performance (which I think it is very unlikely). Thank you for the addition, @jasonb5!
I forgot to add, make sure to update pyproject.toml dependencies too:
https://github.com/xCDAT/xcdat/blob/4f6313ed7bba641d8b32e5ec5f3aedf0009e558b/pyproject.toml#L25-L36
I'm curious, what are some scenarios where pre-calculated weights might be used/preferred over internally generated weights?
@pochedls and @chengzhuzhang can weigh in on the usefulness of this enhancement.
I don't know if I would use this frequently, but I could see it being useful and a reasonable feature to include. I think if this is a matter of just adding this as an input/output, then it isn't a big deal to add, but if it requires more work it might make sense to tackle this on a longer time horizon with a specific issue/PR.
I'm curious, what are some scenarios where pre-calculated weights might be used/preferred over internally generated weights? @pochedls and @chengzhuzhang can weigh in on the usefulness of this enhancement.
I don't know if I would use this frequently, but I could see it being useful and a reasonable feature to include. I think if this is a matter of just adding this as an input/output, then it isn't a big deal to add, but if it requires more work it might make sense to tackle this on a longer time horizon with a specific issue/PR.
Jill mentioned in our 1:1 meeting that it might be performant to re-use the internally generated weights (outputted via this PR) for variables on the same grid.
I think xCDAT's xESMF regridder already supports this feature by inheriting xESMF's weights parameter to use custom weights (link). I have not tested this yet.
I suggest we open a separate GitHub issue and PR to tackle this enhancement for Regrid2 if we think it will be useful.
I'm going to go ahead and add the mask variable to the grid dataset returned from ds.regridder.grid property. This will simplify a few lines of code in this PR.
As for re-using weights as Tom mentioned xESMF already supports this and it would be easy to add for Regrid2. I can create an new issue to discuss this.
I'm going to go ahead and add the
maskvariable to the grid dataset returned fromds.regridder.gridproperty. This will simplify a few lines of code in this PR.As for re-using weights as Tom mentioned xESMF already supports this and it would be easy to add for Regrid2. I can create an new issue to discuss this.
Thanks @jasonb5. This PR looks like it'll be good to go after you add the mask variable with tests.
Once you merge, I can proceed with releasing v0.9.0.
@tomvothecoder Is it fine to merge now?
Hey @jasonb5, I had some review comments. I think we need to add the same
create_nan_masklogic in the Regrid2 code to the xESMF regridder too. That way xESMF can use the correct "mask" variable from the source variable for more accurate regridding. Right now it just transposes any existing "mask" variable.Once this is implemented with tests, you can merge.
This is fine, I can add it now. Didn't know if we wanted to address this in a later PR.
@tomvothecoder This is done once the CI passes.
I rebased this branch on main, fixed docstrings and type annotations, and added a suggestion from GH Copilot (here).
I will merge now. Thanks for the work, @jasonb5!