uxarray
uxarray copied to clipboard
Non-Conservative Zonal Mean
Closes #93
Overview
- Implement Non-Conservative Zonal Mean.
- Bug fixes for
_get_zonal_faces_weight_at_constLatalgorithm - Bug fixes for
gca_constlat_intersectionalgorithm - Bug fixes for
_pole_point_inside_polygonalgorithm
Expected Usage
import uxarray as ux
grid_path = "/path/to/grid.nc"
data_path = "/path/to/data.nc"
uxds = ux.open_dataset(grid_path, data_path)
# compute the zonal average for the default start_lat_deg=-90, end_lat_deg=90, step_deg=5
zonal_result = uxds['data_var'].zonal_mean()
# Compute the zonal mean for a single latitude:
zonal_result = uxds["t2m"].zonal_mean(30.0)
PR Checklist
General
- [x] An issue is linked created and linked
- [x] Add appropriate labels
- [x] Filled out Overview and Expected Usage (if applicable) sections
Testing
- [x] Adequate tests are created if there is new functionality
- [x] Tests cover all possible logical paths in your function
- [x] Tests are not too basic (such as simply calling a function and nothing else)
Documentation
- [x] Docstrings have been added to all new functions
- [x] Docstrings have updated with any function changes
- [x] Internal functions have a preceding underscore (
_) and have been added todocs/internal_api/index.rst - [x] User functions have been added to
docs/user_api/index.rst
Examples
- [x] Any new notebook examples added to
docs/examples/folder - [x] Clear the output of all cells before committing
- [x] New notebook files added to
docs/examples.rsttoctree - [ ] New notebook files added to new entry in
docs/gallery.ymlwith appropriate thumbnail photo indocs/_static/thumbnails/
Hi @hongyuchen1030
I've set up the initial boilerplate for your implementation. If you'd like to discuss anything about the implementation before getting started, we can chat about it here.
One question though: will this implementation be the conservative or non-conservative one?
Hi @hongyuchen1030
I've set up the initial boilerplate for your implementation. If you'd like to discuss anything about the implementation before getting started, we can chat about it here.
One question though: will this implementation be the conservative or non-conservative one?
Thank you very much for your great help! I plan to implement the non-conservative one.
And can you also fill up the expected usage for me please? so that I have a better idea to start? Thanks
Hi @hongyuchen1030 I've set up the initial boilerplate for your implementation. If you'd like to discuss anything about the implementation before getting started, we can chat about it here. One question though: will this implementation be the conservative or non-conservative one?
Thank you very much for your great help! I plan to implement the non-conservative one.
And can you also fill up the expected usage for me please? so that I have a better idea to start? Thanks
Should be good now.
I wanted to ask and see how you think this type of functionality will work on data variables mapped to different elements. For example, we want to support data mapped to:
- Nodes
- Edges
- Faces
Does your algorithm work on all these of these?
Hi @hongyuchen1030 I've set up the initial boilerplate for your implementation. If you'd like to discuss anything about the implementation before getting started, we can chat about it here. One question though: will this implementation be the conservative or non-conservative one?
Thank you very much for your great help! I plan to implement the non-conservative one. And can you also fill up the expected usage for me please? so that I have a better idea to start? Thanks
Should be good now.
I wanted to ask and see how you think this type of functionality will work on data variables mapped to different elements. For example, we want to support data mapped to:
- Nodes
- Edges
- Faces
Does your algorithm work on all these of these?
A non-conservative zonal average is an average value on a constant latitude. So I will say it might be mapped to the faces on that latitude.
In your example function call, where is the place that users input latitude for query? And how do users know it's a non-conservative zonal average?
Such query should looks like res = grid.nc_zonal_mean(60, "Temperature")#60 degree for temperature mean
I may have worded the first part of my question incorrectly. I was asking whether your zonal average algorithm can compute the zonal average of data variables that either node, edge, or face centered. Typically the data variables are mapped to the faces, but less commonly data can be stored on each node or edge. For this application thought, I think it should only apply to edge or face centered data, since computing the zonal average of data on each node doesn't make much spatial sense.
For the second part, I can update it to reflect that (I put a very bare-bone outline)
Below is a zonal-average figure, with the righthand plot being the zonal average across the latitudes.
I think our UxDataArray.zonal_average()should compute the zonal average across a discrete range of lattitudes, something like -90 to 90 with a step size of 5 for example. This would generate a plot like the one above.
The example you gave would be computing the zonal average at a specific latitude. This should be a helper function under uxarray.core.zonal and can be called:
# helper function
_zonal_average_at_lattitude(data: np.ndarray, lat: float, conservative: bool = False)
This would contain the bulk of the artihmatic.
This would allow the one implemented at for the UxDataArray to call the helper depending on the user's parameters
# default, returns the zonal average for latitudes in the range [-90, 90] with a step size of 5
uxds['t2m'].zonal_average()
# specify range and step size
uxds['t2m'].zonal_average(lat = (-45, 45, 5))
# compute the zonal average for a single longitude
uxds['t2m'].zonal_average(lat = 45)
# parameter for conservative
uxds['t2m'].zonal_average(conservative=False)
Just a thought, we may want some Grid helper functions for getting the indices of the edges or faces that intersect a latitude line.
Here's a reference from NCL's zonal average page (which is where I got the figure above from) https://www.ncl.ucar.edu/Applications/zonal.shtml
I may have worded the first part of my question incorrectly. I was asking whether your zonal average algorithm can compute the zonal average of data variables that either node, edge, or face centered. Typically the data variables are mapped to the faces, but less commonly data can be stored on each node or edge. For this application thought, I think it should only apply to edge or face centered data, since computing the zonal average of data on each node doesn't make much spatial sense.
For the second part, I can update it to reflect that (I put a very bare-bone outline)
Below is a zonal-average figure, with the righthand plot being the zonal average across the latitudes.
I think our
UxDataArray.zonal_average()should compute the zonal average across a discrete range of lattitudes, something like -90 to 90 with a step size of 5 for example. This would generate a plot like the one above.The example you gave would be computing the zonal average at a specific latitude. This should be a helper function under
uxarray.core.zonaland can be called:# helper function _zonal_average_at_lattitude(lat: float, conservative: bool = False)This would contain the bulk of the artihmatic.
This would allow the one implemented at for the
UxDataArrayto call the helper depending on the user's parameters# default, returns the zonal average for latitudes in the range [-90, 90] with a step size of 5 uxds['t2m'].zonal_average() # specify range and step size uxds['t2m'].zonal_average(lat = (-45, 45, 5)) # compute the zonal average for a single longitude uxds['t2m'].zonal_average(lat = 45) # parameter for conservative uxds['t2m'].zonal_average(conservative=False)Just a thought, we may want some
Gridhelper functions for getting the indices of the edges or faces that intersect a latitude line.
Thank you for your information. I was told that this function will be averaging the data for each face. So I only implemented for the scenario where: a data is attached to each face.
So I expected this current branch to be for a single zonal_average only, the helper function you mentioned. Once I finished this helper function, you can feel free to implement the discrete values function by calling my function.
Thank you for your information. I was told that this function will be averaging the data for each face. So I only implemented for the scenario where: a data is attached to each face.
Got it! Face-centered data is by far the most common scenario anyone will run into, so no worries.
So I expected this current branch to be for a single zonal_average only, the helper function you mentioned. Once I finished this helper function, you can feel free to implement the discrete values function by calling my function.
Sounds good, I can handle the UxDataArray implementation in this PR once you finish up the helper function.
I'll get the boilerplate updated with what we just discussed.
Thank you for your information. I was told that this function will be averaging the data for each face. So I only implemented for the scenario where: a data is attached to each face.
Got it! Face-centered data is by far the most common scenario anyone will run into, so no worries.
So I expected this current branch to be for a single zonal_average only, the helper function you mentioned. Once I finished this helper function, you can feel free to implement the discrete values function by calling my function.
Sounds good, I can handle the
UxDataArrayimplementation in this PR once you finish up the helper function.I'll get the boilerplate updated with what we just discussed.
Great! Thank you very much for your help!
Thank you for your information. I was told that this function will be averaging the data for each face. So I only implemented for the scenario where: a data is attached to each face.
Got it! Face-centered data is by far the most common scenario anyone will run into, so no worries.
So I expected this current branch to be for a single zonal_average only, the helper function you mentioned. Once I finished this helper function, you can feel free to implement the discrete values function by calling my function.
Sounds good, I can handle the
UxDataArrayimplementation in this PR once you finish up the helper function. I'll get the boilerplate updated with what we just discussed.Great! Thank you very much for your help!
Always happy to help!
Check out this pull request on ![]()
See visual diffs & provide feedback on Jupyter Notebooks.
Powered by ReviewNB
@hongyuchen1030
As part of this PR, I'll be putting together a user-guide section too.
Hi @philipc2,
I have a question regarding the expected usage of the zonal_mean in the PR description:
zonal_mean is a member function of UxDataArray class,
However, zonal_mean is accessed as UxDataset.zonal_mean in the PR description.
I checked api.py and noticed that it does not have a function similar to open_dataset that returns the UxDataArray class.
Could you please clarify if UxDataArray and UxDataset are both expected to be returned by open_dataset function in the future?
Thank you! --Amber
Hi @philipc2,
I have a question regarding the expected usage of the
zonal_meanin the PR description:
zonal_meanis a member function ofUxDataArrayclass,However,
zonal_meanis accessed asUxDataset.zonal_meanin the PR description.I checked
api.pyand noticed that it does not have a function similar toopen_datasetthat returns theUxDataArrayclass.Could you please clarify if
UxDataArrayandUxDatasetare both expected to be returned byopen_datasetfunction in the future?Thank you! --Amber
Hi @amberchen122
uxds is an instance of a UxDataset, however once we access the data variable of interested, in our case uxds['data_var'], we are now working with a UxDataArray`.
You can think of a UxDataset as a collection of UxDataArrays
I'd suggest giving this section in our user guide a read, it should hopefully clarify it!
Let me know if you have any other questions!
Hi team,
-
I have completed the implementation of the zonal-mean function and its helper functions in
zonal.py. I am now proceeding with testing and documentation. @philipc2, could you please provide any general guidelines for testing and documentation that I should follow? -
I have assumed that the zonal-mean function handles only face-centered data. Could you confirm if this is correct? @philipc2 @hongyuchen1030 Link to code
-
When a
constLatdoes not intersect any face, I have returnedINT_FILL_VALUEas a placeholder value. If there is a preferred placeholder value, please let me know. Link to code -
Additionally, regarding the zonal-mean API, should we allow users to specify the latitude range for which they want to compute the zonal mean?
Thank you for your help! :) --Amber
I attempted to run this on one of our sample meshes, but received an error.
base_path = "../../test/meshfiles/ugrid/outCSne30/"
grid_path = base_path + "outCSne30.ug"
data_path = base_path + "outCSne30_vortex.nc"
uxds = ux.open_dataset(grid_path, data_path)
uxds['psi'].zonal_mean()
ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results
Hi team,
- I have completed the implementation of the zonal-mean function and its helper functions in
zonal.py. I am now proceeding with testing and documentation. @philipc2, could you please provide any general guidelines for testing and documentation that I should follow?- I have assumed that the zonal-mean function handles only face-centered data. Could you confirm if this is correct? @philipc2 @hongyuchen1030 Link to code
- When a
constLatdoes not intersect any face, I have returnedINT_FILL_VALUEas a placeholder value. If there is a preferred placeholder value, please let me know. Link to code- Additionally, regarding the zonal-mean API, should we allow users to specify the latitude range for which they want to compute the zonal mean?
Thank you for your help! :) --Amber
Thank you so much for putting this together and helping out Amber!
- You can refer to our contributors guide some details, however some of it might be a little outdated.
- Yes, having it work for face-centered data only right now is okay
- If there is no intersections found for a single lattitude zonal average, we should have it raise an exception, but if the user is performing the zonal average over a range, any lattitude that doesnt have any intersetions should have its value set to
np.nan. For example, if we perform the zonal average of lattitudes of [-10, 0, 10] and 0 has no intersections, the result should look something like [0.13, np.nan, 0.33] - See my comment above
For documentation, I have already set up a notebook for a user guide section that you can fill out.
uxarray/docs/user-guide/zonal-mean.ipynb
You should also include any functions in both our User and Internal API under uxarray/docs/internal_api.index.rst and uxarray/docs/user_api.index.rst
You can use one of my work in progress sections as a reference if you'd like.
https://uxarray--711.org.readthedocs.build/en/711/user-guide/topological-aggregations.html
I attempted to run this on one of our sample meshes, but received an error.
base_path = "../../test/meshfiles/ugrid/outCSne30/" grid_path = base_path + "outCSne30.ug" data_path = base_path + "outCSne30_vortex.nc" uxds = ux.open_dataset(grid_path, data_path) uxds['psi'].zonal_mean()ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results
There seems to be an error in the bounds computation.
Here is a single face from that mesh:
These are the bounds generated by uxgrid.bounds
Compare this to what is generated with GeoPandas:
Any Thoughts? It looks to be offset by 360, with everything else being correct.
Example to Reproduce
import uxarray as ux
import numpy as np
import holoviews as hv
base_path = "../uxarray/test/meshfiles/ugrid/outCSne30/"
grid_path = base_path + "outCSne30.ug"
data_path = base_path + "outCSne30_vortex.nc"
uxds = ux.open_dataset(grid_path, data_path)
gdf = uxds.uxgrid.to_geodataframe()
ggdf = gdf.to_geopandas()
first_face_bound_ux = uxds.uxgrid.bounds.values[0]
np.rad2deg(first_face_bound_ux)
first_face_bound_gp = ggdf.bounds[0:1]
hv.Polygons(ggdf[0:1])
I attempted to run this on one of our sample meshes, but received an error.
base_path = "../../test/meshfiles/ugrid/outCSne30/" grid_path = base_path + "outCSne30.ug" data_path = base_path + "outCSne30_vortex.nc" uxds = ux.open_dataset(grid_path, data_path) uxds['psi'].zonal_mean()ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results
There seems to be an error in the bounds computation.
Here is a single face from that mesh:
These are the bounds generated by
uxgrid.boundsCompare this to what is generated with GeoPandas:
Any Thoughts? It looks to be offset by 360, with everything else being correct.
Example to Reproduce
import uxarray as ux import numpy as np import holoviews as hv base_path = "../uxarray/test/meshfiles/ugrid/outCSne30/" grid_path = base_path + "outCSne30.ug" data_path = base_path + "outCSne30_vortex.nc" uxds = ux.open_dataset(grid_path, data_path) gdf = uxds.uxgrid.to_geodataframe() ggdf = gdf.to_geopandas() first_face_bound_ux = uxds.uxgrid.bounds.values[0] np.rad2deg(first_face_bound_ux) first_face_bound_gp = ggdf.bounds[0:1] hv.Polygons(ggdf[0:1])
Are these latlon faces? Latlon faces means all edges are either latitude line or longitude lines.
In the bound function, if you didn't pass grid._populate_bounds(is_latlon = True) it will treat every edge as a great circle arc. And great circle arcs will arc up: The max and min latitude is not at the endpoint
There are two test helpers you can utilize to see if that's the problem The following function is a little helper function: it uses an iterative method to find the maximum latitude as the ground true. https://github.com/UXARRAY/uxarray/blob/386e92df3e3a3d436d325c2b7a6ad5f3560c8662/test/test_geometry.py#L166
But to me, the only thing that is off is latitude max:-32.48416(our) and their results:-32.484165. Which seems a lot like a precision issue here.
There used to be several test cases in which I ran your mentioned mesh files, and it produced the same bounds as my iterative method(guaranteed to be correct). However, it seems they're removed now for some reason.
And for the longitude range. The inconsistency in the longitude expression might cause issues, as what happened in the previous coordinates conversion vectorization PR.
Updated Documentation and Tests:
- User and internal API documentation
- uxarray/docs/internal_api.index.rst
- uxarray/docs/user_api.index.rst
- User guide
- uxarray/docs/user-guide/zonal-mean.ipynb
- Enable user-defined latitude range
- Unit test coverage for the internal APIs, where the
_get_zonal_faces_weight_at_constLatfunction is simulated in the test code.
I am also encountering a ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results error in the tests. Due to this issue, I am currently bypassing the _get_zonal_faces_weight_at_constLat function, as it seems to be the source of the problem.
Should I go ahead and try to figure out why build_latlon_box is generating incorrect bounds?
Besides that, all the internal APIs seem to be working correctly; I will implement an integrating test after the bound error gets resolved.
Updated Documentation and Tests:
User and internal API documentation
- uxarray/docs/internal_api.index.rst
- uxarray/docs/user_api.index.rst
User guide
- uxarray/docs/user-guide/zonal-mean.ipynb
Enable user-defined latitude range
Unit test coverage for the internal APIs, where the
_get_zonal_faces_weight_at_constLatfunction is simulated in the test code.I am also encountering a
ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct resultserror in the tests. Due to this issue, I am currently bypassing the_get_zonal_faces_weight_at_constLatfunction, as it seems to be the source of the problem.Should I go ahead and try to figure out why build_latlon_box is generating incorrect bounds?
Besides that, all the internal APIs seem to be working correctly; I will implement an integrating test after the bound error gets resolved.
Thank you very much for your work! If you get the ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results after you update the edge_nodes_cart helper function, then the problem might be in the new edge_nodes_cart implementation.
base_path = "../../test/meshfiles/ugrid/outCSne30/" grid_path = base_path + "outCSne30.ug" data_path = base_path + "outCSne30_vortex.nc" uxds = ux.open_dataset(grid_path, data_path) uxds['psi'].zonal_mean()
I was trying to reproduce this, but I could not load the data files. Do you know why this is happening @philipc2 ?
base_path = "../../test/meshfiles/ugrid/outCSne30/" grid_path = base_path + "outCSne30.ug" data_path = base_path + "outCSne30_vortex.nc" uxds = ux.open_dataset(grid_path, data_path) uxds['psi'].zonal_mean()I was trying to reproduce this, but I could not load the data files. Do you know why this is happening @philipc2 ?
It's most likely due to the path being invalid. I'd double check the base path.
@philipc2
test/test_api.py::TestAPI::test_open_mf_dataset is failing after I added test/meshfiles/ugrid/outCSne30/outCSne30_test2.nc
I think the reason is because https://github.com/UXARRAY/uxarray/blob/00a4988513ceb816baa3890449726ac4681db653/test/test_api.py#L27C5-L28C68
this hardcoded file path incorrectly matched outCSne30_test2.nc.
Can you please look into it? I opened an issue #860
@philipc2
test/test_api.py::TestAPI::test_open_mf_datasetis failing after I addedtest/meshfiles/ugrid/outCSne30/outCSne30_test2.ncI think the reason is because https://github.com/UXARRAY/uxarray/blob/00a4988513ceb816baa3890449726ac4681db653/test/test_api.py#L27C5-L28C68 this hardcoded file path incorrectly matched
outCSne30_test2.nc.Can you please look into it? I opened an issue #860
Please see our comments in that issue
In the process of review this now, I'll also add a simple ASV benchmark that we can run through this PR.
Here's an example I was able to generate
import matplotlib.pyplot as plt
from matplotlib import gridspec
import uxarray as ux
base_path = "../../test/meshfiles/ugrid/outCSne30/"
grid_path = base_path + "outCSne30.ug"
data_path = base_path + "outCSne30_vortex.nc"
uxds = ux.open_dataset(grid_path, data_path)
zonal_avg_psi = uxds['psi'].zonal_mean()
# create a figure
fig = plt.figure()
# to change size of subplot's
# set height of each subplot as 8
fig.set_figheight(5)
# set width of each subplot as 8
fig.set_figwidth(15)
# create grid for different subplots
spec = gridspec.GridSpec(ncols=2, nrows=1,
width_ratios=[5, 1], wspace=0.1)
# Global Plot
ax0 = fig.add_subplot(spec[0])
ax0.set_xlim((-180, 180))
ax0.set_ylim((-90, 90))
pc = uxds["psi"].to_polycollection(periodic_elements='exclude')
pc.set_cmap("plasma")
ax0.add_collection(pc)
cbar = fig.colorbar(pc, ax=ax0, orientation='vertical', location='left', shrink=0.8)
cbar.set_label('Colorbar Label')
ax0.set_title("PSI")
# Zonal Plot
ax1 = fig.add_subplot(spec[1])
ax1.set_ylim((-90, 90))
ax1.set_xlim((-1.5, 1.5))
ax1.grid()
ax1.set_title("PSI Zonal Mean")
ax1.plot(zonal_avg_psi.values, zonal_avg_psi.latitude_deg.values, lw=2)



Any Thoughts? It looks to be offset by 360, with everything else being correct.
