uxarray
uxarray copied to clipboard
Optimize the `get_cartesian_face_edge_nodes` and `get_lonlat_rad_face_edge_nodes`
Related to #730 and #785
Overview
- Optimize the
get_cartesian_face_edge_nodesandget_lonlat_rad_face_edge_nodesfunctions to make them vectorized and improve performance. - This function was called in
grid_populate_boundsand will be called inzonal_mean().
Expected Usage
import uxarray as ux
import numpy as np
# Define vertices
vertices = [[0.5, 0.5, 0.5], [-0.5, 0.5, 0.5], [-0.5, -0.5, 0.5], [0.5, -0.5, 0.5]]
# Normalize the vertices
vertices = [x / np.linalg.norm(x) for x in vertices]
# Construct the grid from the vertices
grid = ux.Grid.from_face_vertices(vertices, latlon=False)
# Compute Cartesian face edge nodes
face_edges_connectivity_cartesian = _get_cartesian_face_edge_nodes(
grid.face_node_connectivity.values,
grid.node_x.values,
grid.node_y.values, grid.node_z.values
)
PR Checklist
General
- [x] An issue is linked created and linked
- [x] Add appropriate labels
- [x] Filled out Overview and Expected Usage (if applicable) sections
Testing
- [x] Adequate tests are created if there is new functionality
- [x] Tests cover all possible logical paths in your function
- [x] Tests are not too basic (such as simply calling a function and nothing else)
Documentation
- [x] Docstrings have been added to all new functions
- [x] Docstrings have updated with any function changes
- [x] Internal functions have a preceding underscore (
_) and have been added todocs/internal_api/index.rst - [x] User functions have been added to
docs/user_api/index.rst
Performance Benchmarks
cd benchmarks asv continuous 93cf4d9fd6d29fa8ed26d527ebe1724264da86d8 386e92df3e3a3d436d325c2b7a6ad5f3560c8662 --bench face_bounds.FaceBounds.time_face_boundsBefore
### After
### Comparison Looks like a solid ~1.6x increase in performance for the `geoflow-small` grid.
Interesting, the test is failing for the `meshfiles/mpas/QU/oQU480.231010.nc' grid now with the following error:
File "/Users/philipc/PycharmProjects/uxarray-devel/uxarray/benchmarks/env/6c8209bc6e474780e9630a614b995a74/lib/python3.11/site-packages/uxarray/grid/utils.py", line 280, in _get_cartesian_face_edge_nodes return np.array(faces_edges_coordinates) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (1791,) + inhomogeneous part.
The
oQU480grid is an MPAS ocean mesh, with no faces/elements over continents .
Can you look into this mesh and see if you can reproduce the error?
Checking on it now
Performance Benchmarks
cd benchmarks asv continuous 93cf4d9fd6d29fa8ed26d527ebe1724264da86d8 386e92df3e3a3d436d325c2b7a6ad5f3560c8662 --bench face_bounds.FaceBounds.time_face_boundsBefore
### After
### Comparison Looks like a solid ~1.6x increase in performance for the `geoflow-small` grid.
Interesting, the test is failing for the `meshfiles/mpas/QU/oQU480.231010.nc' grid now with the following error:
File "/Users/philipc/PycharmProjects/uxarray-devel/uxarray/benchmarks/env/6c8209bc6e474780e9630a614b995a74/lib/python3.11/site-packages/uxarray/grid/utils.py", line 280, in _get_cartesian_face_edge_nodes return np.array(faces_edges_coordinates) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (1791,) + inhomogeneous part.
The
oQU480grid is an MPAS ocean mesh, with no faces/elements over continents .
Can you look into this mesh and see if you can reproduce the error?
Hi @philipc2 ,
I have fixed the MPAS grid and included that in the test case as well.
I noticed that the grid.node_x,grid.node_y,grid.node_z when reading in the MPAS data, are all unnormalized, is that what we expect?
I noticed that the grid.node_x,grid.node_y,grid.node_z when reading in the MPAS data, are all unnormalized, is that what we expect?
Yeah, we parse these directly from the MPAS grid file and don't modify them.
Can you try out something like this?
edge_node_connectivity = uxgrid.edge_node_connectivity.values
node_lon = uxgrid.node_lon.values
node_lat = uxgrid.node_lat.values
# Longitudes of the nodes that make up each edge in radians: (n_edge, 2)
edge_nodes_lon_rad = np.deg2rad(node_lon[edge_node_connectivity])
# Lattitudes of the nodes that make up each edge in radians: (n_edge, 2)
edge_nodes_lat_rad = np.deg2rad(node_lat[edge_node_connectivity])
# LonLat pairs for the nodes that make up each edge in radians: (n_edge, 2, 2)
edge_nodes_lonlat_rad = np.array([edge_nodes_lon_rad, edge_nodes_lat_rad]).swapaxes(0, 1)
face_edge_connectivity = uxgrid.face_edge_connectivity.values
# obtain a 1D view
face_edge_connectivity_raveled = face_edge_connectivity.ravel()
non_fill_value_mask = face_edge_connectivity_raveled != INT_FILL_VALUE
face_edges_lonlat_rad = np.full((uxgrid.n_face * uxgrid.n_max_face_edges, 2, 2), INT_FILL_VALUE)
face_edges_lonlat_rad[non_fill_value_mask] = edge_nodes_lonlat_rad[face_edge_connectivity_raveled[non_fill_value_mask]]
face_edges_lonlat_rad = face_edges_lonlat_rad.reshape((uxgrid.n_face, uxgrid.n_max_face_edges, 2, 2))
The result array face_edges_lonlat_rad contains the LonLat pairs for the edges that make up every face. This implementation here is completely vectorized.
Let me know if you need me to clarify anything.
Assuming that face_edge_connectivity and edge_node_connectivity are already constructed, this executes in about 200ms on my laptop for the MPAS mesh that I described above.
A slightly cleaned up version below.
edge_node_connectivity = uxgrid.edge_node_connectivity.values
node_lon = uxgrid.node_lon.values
node_lat = uxgrid.node_lat.values
face_edge_connectivity = uxgrid.face_edge_connectivity.values
n_face = uxgrid.n_face
n_max_face_edges = uxgrid.n_max_face_edges
def foo(node_lon, node_lat, edge_node_connectivity, face_edge_connectivity, n_face, n_max_face_edges):
edge_nodes_lon_rad = np.deg2rad(node_lon[edge_node_connectivity])
# (n_edge, 2)
edge_nodes_lat_rad = np.deg2rad(node_lat[edge_node_connectivity])
# (n_edge, 2, 2)
edge_nodes_lonlat_rad = np.array([edge_nodes_lon_rad, edge_nodes_lat_rad]).swapaxes(0, 1)
face_edge_connectivity_raveled = face_edge_connectivity.ravel()
non_fill_value_mask = face_edge_connectivity_raveled != INT_FILL_VALUE
face_edges_lonlat_rad = np.full((n_face * n_max_face_edges, 2, 2), INT_FILL_VALUE)
face_edges_lonlat_rad[non_fill_value_mask] = edge_nodes_lonlat_rad[face_edge_connectivity_raveled[non_fill_value_mask]]
face_edges_lonlat_rad = face_edges_lonlat_rad.reshape((n_face, n_max_face_edges, 2, 2))
return face_edges_lonlat_rad
The implementation above is for get_lonlat_rad_face_edge_nodes but can be easily modified to also work for get_cartesian_face_edge_nodes
Can you try out something like this?
edge_node_connectivity = uxgrid.edge_node_connectivity.values node_lon = uxgrid.node_lon.values node_lat = uxgrid.node_lat.values # Longitudes of the nodes that make up each edge in radians: (n_edge, 2) edge_nodes_lon_rad = np.deg2rad(node_lon[edge_node_connectivity]) # Lattitudes of the nodes that make up each edge in radians: (n_edge, 2) edge_nodes_lat_rad = np.deg2rad(node_lat[edge_node_connectivity]) # LonLat pairs for the nodes that make up each edge in radians: (n_edge, 2, 2) edge_nodes_lonlat_rad = np.array([edge_nodes_lon_rad, edge_nodes_lat_rad]).swapaxes(0, 1) face_edge_connectivity = uxgrid.face_edge_connectivity.values # obtain a 1D view face_edge_connectivity_raveled = face_edge_connectivity.ravel() non_fill_value_mask = face_edge_connectivity_raveled != INT_FILL_VALUE face_edges_lonlat_rad = np.full((uxgrid.n_face * uxgrid.n_max_face_edges, 2, 2), INT_FILL_VALUE) face_edges_lonlat_rad[non_fill_value_mask] = edge_nodes_lonlat_rad[face_edge_connectivity_raveled[non_fill_value_mask]] face_edges_lonlat_rad = face_edges_lonlat_rad.reshape((uxgrid.n_face, uxgrid.n_max_face_edges, 2, 2))The result array
face_edges_lonlat_radcontains the LonLat pairs for the edges that make up every face. This implementation here is completely vectorized.Let me know if you need me to clarify anything.
Assuming that
face_edge_connectivityandedge_node_connectivityare already constructed, this executes in about 200ms on my laptop for the MPAS mesh that I described above.A slightly cleaned up version below.
edge_node_connectivity = uxgrid.edge_node_connectivity.values node_lon = uxgrid.node_lon.values node_lat = uxgrid.node_lat.values face_edge_connectivity = uxgrid.face_edge_connectivity.values n_face = uxgrid.n_face n_max_face_edges = uxgrid.n_max_face_edges def foo(node_lon, node_lat, edge_node_connectivity, face_edge_connectivity, n_face, n_max_face_edges): edge_nodes_lon_rad = np.deg2rad(node_lon[edge_node_connectivity]) # (n_edge, 2) edge_nodes_lat_rad = np.deg2rad(node_lat[edge_node_connectivity]) # (n_edge, 2, 2) edge_nodes_lonlat_rad = np.array([edge_nodes_lon_rad, edge_nodes_lat_rad]).swapaxes(0, 1) face_edge_connectivity_raveled = face_edge_connectivity.ravel() non_fill_value_mask = face_edge_connectivity_raveled != INT_FILL_VALUE face_edges_lonlat_rad = np.full((n_face * n_max_face_edges, 2, 2), INT_FILL_VALUE) face_edges_lonlat_rad[non_fill_value_mask] = edge_nodes_lonlat_rad[face_edge_connectivity_raveled[non_fill_value_mask]] face_edges_lonlat_rad = face_edges_lonlat_rad.reshape((n_face, n_max_face_edges, 2, 2)) return face_edges_lonlat_radThe implementation above is for
get_lonlat_rad_face_edge_nodesbut can be easily modified to also work forget_cartesian_face_edge_nodes
-
The
face_edge_connectivityis unnecessary here since all the information is stored in theface_nodes_connectivity, Theface_edge_connectivityis not directed and is just simplify built by concatenating theface_nodes_connectivity -
This function doesn't seem to be able to produce the results I need, because I need the LonLat pairs for the edges to be "directed", for example: [[0,1],[1,5],[5,4],[4,0]], each edge's first node is connected to the last edge last node and comes back to the begin node. That's why I have to use my current implementation to ensure the order. And I didn't notice any operation in codes to ensure such order. @amberchen122 , Can you help me look into this and see if the results are different please? Thanks
@hongyuchen1030
I didn't consider the ordering. I think I can update the implementation I provided to take that into account. Looking into it!
I think this should work.
def build_face_edges_lonlat_rad(face_node_conn, n_nodes_per_face, n_face, n_max_face_edges, node_lon, node_lat):
face_node_conn_shift = np.roll(face_node_conn, -1, axis=1)
for i, final_node_idx in enumerate(n_nodes_per_face):
face_node_conn_shift[i, final_node_idx-1] = face_node_conn[i, 0]
face_edge_conn = np.array([face_node_conn, face_node_conn_shift]).T.swapaxes(0, 1)
face_edge_conn_ravel = face_edge_conn.reshape((n_face * n_max_face_edges, 2))
non_fill_value_mask = face_edge_conn_ravel[:, 0] != INT_FILL_VALUE
nod_lon_rad = np.deg2rad(node_lon)
node_lat_rad = np.deg2rad(node_lat)
edge_node_lon_a = nod_lon_rad[face_edge_conn_ravel[:, 0][non_fill_value_mask]]
edge_node_lat_a = node_lat_rad[face_edge_conn_ravel[:, 0][non_fill_value_mask]]
edge_node_lon_b = nod_lon_rad[face_edge_conn_ravel[:, 1][non_fill_value_mask]]
edge_node_lat_b = node_lat_rad[face_edge_conn_ravel[:, 1][non_fill_value_mask]]
face_edges_lonlat_rad = np.full((n_face * n_max_face_edges, 2, 2), np.nan)
face_edges_lonlat_rad[:, 0][non_fill_value_mask] = np.vstack([edge_node_lon_a, edge_node_lat_a]).T
face_edges_lonlat_rad[:, 1][non_fill_value_mask] = np.vstack([edge_node_lon_b, edge_node_lat_b]).T
face_edges_lonlat_rad = face_edges_lonlat_rad.reshape((n_face, n_max_face_edges, 2, 2))
return face_edges_lonlat_rad
@hongyuchen1030
Here is what the output would look like for a single face (i.e. face_edges_lonlat_rad[0])
This face has 5/10 of the maximum edges, with the rest being filled with np.nan.
Execution time is under 1s for me on my local machine on the ~650k face mesh.
Does this output look correct?
I plotted a few of the edges, and the results seem to look correct.
Example of a single face. Edge ordering is [Red, Blue, Green, Yellow]
I think this should work.
def build_face_edges_lonlat_rad(face_node_conn, n_nodes_per_face, n_face, n_max_face_edges, node_lon, node_lat): face_node_conn_shift = np.roll(face_node_conn, -1, axis=1) for i, final_node_idx in enumerate(n_nodes_per_face): face_node_conn_shift[i, final_node_idx-1] = face_node_conn[i, 0] face_edge_conn = np.array([face_node_conn, face_node_conn_shift]).T.swapaxes(0, 1) face_edge_conn_ravel = face_edge_conn.reshape((n_face * n_max_face_edges, 2)) non_fill_value_mask = face_edge_conn_ravel[:, 0] != INT_FILL_VALUE nod_lon_rad = np.deg2rad(node_lon) node_lat_rad = np.deg2rad(node_lat) edge_node_lon_a = nod_lon_rad[face_edge_conn_ravel[:, 0][non_fill_value_mask]] edge_node_lat_a = node_lat_rad[face_edge_conn_ravel[:, 0][non_fill_value_mask]] edge_node_lon_b = nod_lon_rad[face_edge_conn_ravel[:, 1][non_fill_value_mask]] edge_node_lat_b = node_lat_rad[face_edge_conn_ravel[:, 1][non_fill_value_mask]] face_edges_lonlat_rad = np.full((n_face * n_max_face_edges, 2, 2), np.nan) face_edges_lonlat_rad[:, 0][non_fill_value_mask] = np.vstack([edge_node_lon_a, edge_node_lat_a]).T face_edges_lonlat_rad[:, 1][non_fill_value_mask] = np.vstack([edge_node_lon_b, edge_node_lat_b]).T face_edges_lonlat_rad = face_edges_lonlat_rad.reshape((n_face, n_max_face_edges, 2, 2)) return face_edges_lonlat_rad@hongyuchen1030
Here is what the output would look like for a single face (i.e.
face_edges_lonlat_rad[0])This face has 5/10 of the maximum edges, with the rest being filled with `np.nan`.
Execution time is under 1s for me on my local machine on the ~650k face mesh.
Does this output look correct?
Do these faces have the same order as the face_nodes_connectivity?
And I feel like the following lines can be replaced with my implementation in the build_single_face_cart without using for loop.
for i, final_node_idx in enumerate(n_nodes_per_face):
face_node_conn_shift[i, final_node_idx-1] = face_node_conn[i, 0]
@amberchen122 , Can you look into this new implementation, please? You can compare their runtime in the test case. See if the results are correct. And finally, see if you can combine these two methods to avoid any for loop. Thanks
Do these faces have the same order as the face_nodes_connectivity
Yes, the order is preserved.
Do these faces have the same order as the face_nodes_connectivity?
And I feel like the following lines can be replaced with my implementation in the build_single_face_cart without using for loop.
for i, final_node_idx in enumerate(n_nodes_per_face): face_node_conn_shift[i, final_node_idx-1] = face_node_conn[i, 0]
It might be able to be replaced, but I think the impact on performance here is minimal with it, and this ensures that the final value loops back performance for all geometries in the case of mixed-geometry meshes.
I'm getting an error running _get_lonlat_rad_face_edge_nodes on a mixed-topology mesh (quads, pentagons, and hexagons)
I'm only able to get it to run for meshes where all elements have the same dimension.
Here's one of the meshes that is failing uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc
Ran a test on a grid that works for both implmentations (48600 faces, 97200 edges, 48602 nodes)
uxarray/test/meshfiles/esmf/ne30/ne30pg3.grid.nc
Current Implementation
- ~500ms
My Implementation
- ~17ms
Do these faces have the same order as the face_nodes_connectivity
Yes, the order is preserved.
Do these faces have the same order as the face_nodes_connectivity? And I feel like the following lines can be replaced with my implementation in the build_single_face_cart without using for loop.
for i, final_node_idx in enumerate(n_nodes_per_face): face_node_conn_shift[i, final_node_idx-1] = face_node_conn[i, 0]It might be able to be replaced, but I think the impact on performance here is minimal with it, and this ensures that the final value loops back performance for all geometries in the case of mixed-geometry meshes.
I'm getting an error running
_get_lonlat_rad_face_edge_nodeson a mixed-topology mesh (quads, pentagons, and hexagons)I'm only able to get it to run for meshes where all elements have the same dimension.
Here's one of the meshes that is failing
uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc
I already put this into my test case yesterday. And it passed perfectly
Do these faces have the same order as the face_nodes_connectivity
Yes, the order is preserved.
Do these faces have the same order as the face_nodes_connectivity? And I feel like the following lines can be replaced with my implementation in the build_single_face_cart without using for loop.
for i, final_node_idx in enumerate(n_nodes_per_face): face_node_conn_shift[i, final_node_idx-1] = face_node_conn[i, 0]It might be able to be replaced, but I think the impact on performance here is minimal with it, and this ensures that the final value loops back performance for all geometries in the case of mixed-geometry meshes.
You don't need to use a for loop to ensure that, just simply do the following or similar things. As long as the face_nodes doesn't have any filled values (which is already be cleaned up using the mask), it will work.
The bug you mentioned should already be fixed, and it's unrelated to the following nodes' order alginment
# Do the vectorized check for counter-clockwise order of edge nodes
face_edges_connectivity[:, 0] = face_nodes
face_edges_connectivity[:, 1] = np.roll(face_nodes, -1)
# Ensure the last edge connects back to the first node to complete the loop
face_edges_connectivity[-1] = [face_nodes[-1], face_nodes[0]]
Do these faces have the same order as the face_nodes_connectivity
Yes, the order is preserved.
Do these faces have the same order as the face_nodes_connectivity? And I feel like the following lines can be replaced with my implementation in the build_single_face_cart without using for loop.
for i, final_node_idx in enumerate(n_nodes_per_face): face_node_conn_shift[i, final_node_idx-1] = face_node_conn[i, 0]It might be able to be replaced, but I think the impact on performance here is minimal with it, and this ensures that the final value loops back performance for all geometries in the case of mixed-geometry meshes.
You don't need to use a for loop to ensure that, just simply do the following or similar things. As long as the face_nodes doesn't have any filled values (which is already be cleaned up using the mask), it will work.
The bug you mentioned should already be fixed, and it's unrelated to the following nodes' order alginment
# Do the vectorized check for counter-clockwise order of edge nodes face_edges_connectivity[:, 0] = face_nodes face_edges_connectivity[:, 1] = np.roll(face_nodes, -1) # Ensure the last edge connects back to the first node to complete the loop face_edges_connectivity[-1] = [face_nodes[-1], face_nodes[0]]
Do these faces have the same order as the face_nodes_connectivity
Yes, the order is preserved.
Do these faces have the same order as the face_nodes_connectivity? And I feel like the following lines can be replaced with my implementation in the build_single_face_cart without using for loop.
for i, final_node_idx in enumerate(n_nodes_per_face): face_node_conn_shift[i, final_node_idx-1] = face_node_conn[i, 0]It might be able to be replaced, but I think the impact on performance here is minimal with it, and this ensures that the final value loops back performance for all geometries in the case of mixed-geometry meshes.
You don't need to use a for loop to ensure that, just simply do the following or similar things. As long as the face_nodes doesn't have any filled values (which is already be cleaned up using the mask), it will work.
The bug you mentioned should already be fixed, and it's unrelated to the following nodes' order alginment
# Do the vectorized check for counter-clockwise order of edge nodes face_edges_connectivity[:, 0] = face_nodes face_edges_connectivity[:, 1] = np.roll(face_nodes, -1) # Ensure the last edge connects back to the first node to complete the loop face_edges_connectivity[-1] = [face_nodes[-1], face_nodes[0]]
This implementation is only for a single face though, correct? The vectorization here seems to within the _get_lonlat_rad_single_face_edge_nodes function, and it is still called n_face times within _get_lonlat_rad_face_edge_nodes
Do these faces have the same order as the face_nodes_connectivity
Yes, the order is preserved.
Do these faces have the same order as the face_nodes_connectivity? And I feel like the following lines can be replaced with my implementation in the build_single_face_cart without using for loop.
for i, final_node_idx in enumerate(n_nodes_per_face): face_node_conn_shift[i, final_node_idx-1] = face_node_conn[i, 0]It might be able to be replaced, but I think the impact on performance here is minimal with it, and this ensures that the final value loops back performance for all geometries in the case of mixed-geometry meshes.
You don't need to use a for loop to ensure that, just simply do the following or similar things. As long as the face_nodes doesn't have any filled values (which is already be cleaned up using the mask), it will work. The bug you mentioned should already be fixed, and it's unrelated to the following nodes' order alginment
# Do the vectorized check for counter-clockwise order of edge nodes face_edges_connectivity[:, 0] = face_nodes face_edges_connectivity[:, 1] = np.roll(face_nodes, -1) # Ensure the last edge connects back to the first node to complete the loop face_edges_connectivity[-1] = [face_nodes[-1], face_nodes[0]]Do these faces have the same order as the face_nodes_connectivity
Yes, the order is preserved.
Do these faces have the same order as the face_nodes_connectivity? And I feel like the following lines can be replaced with my implementation in the build_single_face_cart without using for loop.
for i, final_node_idx in enumerate(n_nodes_per_face): face_node_conn_shift[i, final_node_idx-1] = face_node_conn[i, 0]It might be able to be replaced, but I think the impact on performance here is minimal with it, and this ensures that the final value loops back performance for all geometries in the case of mixed-geometry meshes.
You don't need to use a for loop to ensure that, just simply do the following or similar things. As long as the face_nodes doesn't have any filled values (which is already be cleaned up using the mask), it will work. The bug you mentioned should already be fixed, and it's unrelated to the following nodes' order alginment
# Do the vectorized check for counter-clockwise order of edge nodes face_edges_connectivity[:, 0] = face_nodes face_edges_connectivity[:, 1] = np.roll(face_nodes, -1) # Ensure the last edge connects back to the first node to complete the loop face_edges_connectivity[-1] = [face_nodes[-1], face_nodes[0]]This implementation is only for a single face though, correct?
Correct, that's why I asked @amberchen122 to see IF she can ADAPT it. If it cannot be, we can still use the simple for loop to get everything done. I just feel like this nodes shift should be able to vectorized as well. Since we are doing the optimization in this PR, we don't want to leave something unfinished. So that we don't have to look back again
Correct, that's why I asked @amberchen122 to see IF she can ADAPT it. If it cannot be, we can still use the simple for loop to get everything done. I just feel like this nodes shift should be able to vectorized as well. Since we are doing the optimization in this PR, we don't want to leave something unfinished. So that we don't have to look back again
100% agree.
My main concern with the implementation here is that it isn't truly vectorized, since we still have n_face calls to _get_lonlat_rad_single_face_edge_nodes.
The implementation I provided would eliminate the need to have a _get_lonlat_rad_single_face_edge_nodes, since the entire array would be processed with a single function.
Correct, that's why I asked @amberchen122 to see IF she can ADAPT it. If it cannot be, we can still use the simple for loop to get everything done. I just feel like this nodes shift should be able to vectorized as well. Since we are doing the optimization in this PR, we don't want to leave something unfinished. So that we don't have to look back again
100% agree.
My main concern with the implementation here is that it isn't truly vectorized, since we still have
n_facecalls to_get_lonlat_rad_single_face_edge_nodes.The implementation I provided would eliminate the need to have a
_get_lonlat_rad_single_face_edge_nodes, since the entire array would be processed with a single function.
I understand. This helper function was assigned to Aaron but it wasn't handled efficiently. And I had to re-implement it again so that @amberchen122 can proceed more efficiently for the zonal average. My main purpose of the current implementation was not for the full vectorization, it was to provide a correct APIs call way so that she can use it right now. That's why the outer function still call it n_faces time because of my cursory work.
Since we now want to fully optimize this functionality, we can spend some time digging deep into the entire function. And that's why I feel like combining your work for the global face process and my work on the single face process will be a good idea. And we want to get it properly done this time so that we don't have to go back and re-design it again.
https://github.com/UXARRAY/uxarray/blob/ad9af81f292cfdf28ac77d3b4c23017110f93f8f/uxarray/grid/geometry.py#L884-L900
These can be updated to call the vectorized implementations, moving the function call right before the for loop.
https://github.com/UXARRAY/uxarray/blob/ad9af81f292cfdf28ac77d3b4c23017110f93f8f/uxarray/grid/geometry.py#L884-L900
These can be updated to call the vectorized implementations, moving the function call right before the for loop.
https://github.com/UXARRAY/uxarray/blob/ad9af81f292cfdf28ac77d3b4c23017110f93f8f/uxarray/grid/geometry.py#L884-L900
These can be updated to call the vectorized implementations, moving the function call right before the for loop.
I don't think that's the latest code. The calls have already been moved out from the for-loop
_get_cartesian_face_edge_nodes
EDIT: You're right, I was looking at the wrong file. My bad!
This still produces a total of n_face calls to _get_cartesian_face_edge_nodes, since we are indexing the face_node_connectivity and face_edge_connectivity
Instead of operating on slices, we should compute the entire face_edges_cartesian and face_edges_lonlat_rad arrays prior to entering the for loop.
face_edges_cartesian = _get_cartesian_face_edge_nodes(...)
face_edges_lonlat_rad = _get_lonlat_rad_face_edge_nodes(...)
for i in range(n_face):
cur_face_edges_cart = face_edges_cartesian[i]
cur_face_edges_lonlat_rad = face_edges_lonlat_rad[i]
...
Having a total of n_face indexes as opposed to n_face function calls will significantly improve the performance here, roughly by a factor of 20-30x from my initial testing.
The next step, which is out of the scope of this PR, would be to allow _poplate_face_latlon_bound() to operate on the entire arrays, as opposed to on each individual face.
https://github.com/UXARRAY/uxarray/blob/ad9af81f292cfdf28ac77d3b4c23017110f93f8f/uxarray/grid/geometry.py#L906-L911
Eventually, this entire loop won't be needed when we are completely vectorized.
https://github.com/UXARRAY/uxarray/blob/ad9af81f292cfdf28ac77d3b4c23017110f93f8f/uxarray/grid/geometry.py#L884
_get_cartesian_face_edge_nodes
EDIT: You're right, I was looking at the wrong file. My bad!
This still produces a total of
n_facecalls to_get_cartesian_face_edge_nodes, since we are indexing theface_node_connectivityandface_edge_connectivityInstead of operating on slices, we should compute the entire
face_edges_cartesianandface_edges_lonlat_radarrays prior to entering the for loop.face_edges_cartesian = _get_cartesian_face_edge_nodes(...) face_edges_lonlat_rad = _get_lonlat_rad_face_edge_nodes(...) for i in range(n_face): cur_face_edges_cart = face_edges_cartesian[i] cur_face_edges_lonlat_rad = face_edges_lonlat_rad[i] ...Having a total of
n_faceindexes as opposed ton_facefunction calls will significantly improve the performance here, roughly by a factor of 20-30x from my initial testing.The next step, which is out of the scope of this PR, would be to allow
_poplate_face_latlon_bound()to operate on the entire arrays, as opposed to on each individual face.https://github.com/UXARRAY/uxarray/blob/ad9af81f292cfdf28ac77d3b4c23017110f93f8f/uxarray/grid/geometry.py#L906-L911
Eventually, this entire loop won't be needed when we are completely vectorized.
https://github.com/UXARRAY/uxarray/blob/ad9af81f292cfdf28ac77d3b4c23017110f93f8f/uxarray/grid/geometry.py#L884
We have discussed that before, about the vectorization and optimization.
Our goal is to do agile development while optimizing the existing function: That is we want to produce somethings that users can use ASAP, And then we looked back to optimizing it. And as I mentioned before, I can only provide a working function with correct algorithms, and I need your help to optimize them.
But as we can see here, going back to modify the API design will require massive changes in the codes, leaking potential inconsistency. That's why I asked you to help me build the API boilerplate: So that in the future, we can just modify what's inside the function without touching the API design.
In other words, when setting up the API structure, we should think about the possible future vectorization.
Now looking back for the _get_cartesian_face_edge_nodes and _get_lonlat_rad_face_edge_nodes: These two functions will be used in the _populate_bounds and zonal_mean. Can you come out a desired API design that can satisfy our future need( the vectorization of _populate_bounds and zonal_mean)? So that @amberchen122 can start putting everything together? Thank you very much
Hi @philipc2,
As we work on vectorizing the _populate_bounds and zonal_mean methods, I'd appreciate some clarifications on the API design to ensure that my contributions are contextually aligned with our project's goals. Here are specific aspects where your insights are crucial:
Zonal Mean API Enhancements:
- Could you clarify if the zonal_mean method should automatically attach its results to the existing grid, or if we should provide an option to return the results separately?
- Should we allow users to specify custom latitude ranges for the zonal mean calculation, beyond the standard -90 to 90 degrees?
Populate Bounds Functionality:
Is there anticipated change or current requirement in the _populate_bounds API that _get_cartesian_face_edge_nodes and _get_lonlat_rad_face_edge_nodes should be aware of?
Thank you for all the helpful comments as I get familiar with the UxArray project.
Best, Amber :)
Could you clarify if the zonal_mean method should automatically attach its results to the existing grid, or if we should provide an option to return the results separately?
The result will not be mapped to a grid dimension, but it should be returned as a UxDataArray, which will have a reference to the original grid.
Should we allow users to specify custom latitude ranges for the zonal mean calculation, beyond the standard -90 to 90 degrees?
Yes, it should follow the design here https://github.com/UXARRAY/uxarray/pull/785#issuecomment-2108637972
I have updated _get_cartesian_face_edge_nodes and _get_lonlat_rad_face_edge_nodes with the proposed vectorization and revised the test cases accordingly. The update successfully passed two out of three tests:
Test Output
(uxarray_build) amber@Amber-Desktop:~/git/uxarray$ pytest -k TestFaceEdgeConnectivityHelper test/test_helpers.py ========================================================================================= test session starts ========================================================================================== platform linux -- Python 3.11.9, pytest-8.2.0, pluggy-1.5.0 rootdir: /home/amber/git/uxarray configfile: pyproject.toml plugins: cov-5.0.0 collected 16 items / 13 deselected / 3 selectedtest/test_helpers.py F.. [100%]
=============================================================================================== FAILURES =============================================================================================== ____________________________________________________________ TestFaceEdgeConnectivityHelper.test_get_cartesian_face_edge_nodes_filled_value ____________________________________________________________
self = <test.test_helpers.TestFaceEdgeConnectivityHelper testMethod=test_get_cartesian_face_edge_nodes_filled_value>
def test_get_cartesian_face_edge_nodes_filled_value(self):
# Create the vertices for the grid, based around the North Pole
vertices = [[0.5, 0.5, 0.5], [-0.5, 0.5, 0.5], [-0.5, -0.5, 0.5], [0.5, -0.5, 0.5]]
# Normalize the vertices
vertices = [x / np.linalg.norm(x) for x in vertices]
vertices.append([INT_FILL_VALUE, INT_FILL_VALUE, INT_FILL_VALUE])
# Construct the grid from the vertices
grid = ux.Grid.from_face_vertices(vertices, latlon=False)
# Extract the necessary grid data
face_node_conn = grid.face_node_connectivity.values
n_nodes_per_face = np.array([len(face) for face in face_node_conn])
n_face = len(face_node_conn)
n_max_face_edges = max(n_nodes_per_face)
node_x = grid.node_x.values
node_y = grid.node_y.values
node_z = grid.node_z.values
# Call the function to test
face_edges_connectivity_cartesian = _get_cartesian_face_edge_nodes(
face_node_conn, n_nodes_per_face, n_face, n_max_face_edges, node_x, node_y, node_z
)
test/test_helpers.py:297:
face_node_conn = array([[ 3, 1, 0, 2, -9223372036854775808]]), n_nodes_per_face = array([5]), n_face = 1 n_max_face_edges = 5, node_x = array([-0.57735027, -0.57735027, 0.57735027, 0.57735027]), node_y = array([-0.57735027, 0.57735027, -0.57735027, 0.57735027]) node_z = array([0.57735027, 0.57735027, 0.57735027, 0.57735027])
def _get_cartesian_face_edge_nodes(
face_node_conn, n_nodes_per_face, n_face, n_max_face_edges, node_x, node_y, node_z
):
"""Construct an array to hold the edge Cartesian coordinates connectivity
for multiple faces in a grid.
Parameters
----------
face_node_conn : np.ndarray
An array of shape (n_face, n_max_face_edges) containing the node indices for each face.
n_nodes_per_face : np.ndarray
An array of shape (n_face,) indicating the number of nodes for each face.
n_face : int
The number of faces in the grid.
n_max_face_edges : int
The maximum number of edges for any face in the grid.
node_x : np.ndarray
An array of shape (n_nodes,) containing the x-coordinate values of the nodes.
node_y : np.ndarray
An array of shape (n_nodes,) containing the y-coordinate values of the nodes.
node_z : np.ndarray
An array of shape (n_nodes,) containing the z-coordinate values of the nodes.
Returns
-------
face_edges_cartesian : np.ndarray
An array of shape (n_face, n_max_face_edges, 2, 3) containing the Cartesian coordinates of the edges
for each face. It might contain dummy values if the grid has holes.
Examples
--------
>>> face_node_conn = np.array([[0, 1, 2, 3], [0, 1, 2, 3]])
>>> n_nodes_per_face = np.array([4, 4])
>>> n_face = 2
>>> n_max_face_edges = 4
>>> node_x = np.array([[0, 1, 1, 0], [0, 1, 1, 0]])
>>> node_y = np.array([[0, 0, 1, 1], [0, 0, 1, 1]])
>>> node_z = np.array([[0, 0, 0, 0], [0, 0, 0, 0]])
>>> _get_cartesian_face_edge_nodes(face_node_conn, n_nodes_per_face, n_face, n_max_face_edges, node_x, node_y, node_z)
array([[[[0, 0, 0], [1, 0, 0]],
[[1, 0, 0], [1, 1, 0]],
[[1, 1, 0], [0, 1, 0]],
[[0, 1, 0], [0, 0, 0]]],
[[[0, 0, 0], [1, 0, 0]],
[[1, 0, 0], [1, 1, 0]],
[[1, 1, 0], [0, 1, 0]],
[[0, 1, 0], [0, 0, 0]]]])
"""
# Shift node connections to create edge connections
face_node_conn_shift = np.roll(face_node_conn, -1, axis=1)
# Close the loop for each face by connecting the last node to the first node
for i, final_node_idx in enumerate(n_nodes_per_face):
face_node_conn_shift[i, final_node_idx - 1] = face_node_conn[i, 0]
# Construct edge connections by combining original and shifted node connections
face_edge_conn = np.array([face_node_conn, face_node_conn_shift]).T.swapaxes(0, 1)
# Reshape edge connections and create a mask for valid edges
face_edge_conn_ravel = face_edge_conn.reshape((n_face * n_max_face_edges, 2))
non_fill_value_mask = face_edge_conn_ravel[:, 0] != INT_FILL_VALUE
# Extract Cartesian coordinates for the edge nodes using the mask
edge_node_x_a = node_x[face_edge_conn_ravel[:, 0][non_fill_value_mask]]
edge_node_y_a = node_y[face_edge_conn_ravel[:, 0][non_fill_value_mask]]
edge_node_z_a = node_z[face_edge_conn_ravel[:, 0][non_fill_value_mask]]
edge_node_x_b = node_x[face_edge_conn_ravel[:, 1][non_fill_value_mask]]
E IndexError: index -9223372036854775808 is out of bounds for axis 0 with size 4
uxarray/grid/utils.py:258: IndexError
=========================================================================================== warnings summary ===========================================================================================
test/test_helpers.py: 144 warnings
/home/amber/miniconda3/envs/uxarray_build/lib/python3.11/site-packages/pyfma/_main.py:10: DeprecationWarning: np.find_common_type is deprecated. Please use np.result_type or np.promote_types.
See https://numpy.org/devdocs/release/1.25.0-notes.html and the docs for more information. (Deprecated NumPy 1.25)
dtype = np.find_common_type([], [a.dtype, b.dtype, c.dtype])
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ======================================================================================= short test summary info ======================================================================================== FAILED test/test_helpers.py::TestFaceEdgeConnectivityHelper::test_get_cartesian_face_edge_nodes_filled_value - IndexError: index -9223372036854775808 is out of bounds for axis 0 with size 4 ======================================================================= 1 failed, 2 passed, 13 deselected, 144 warnings in 1.32s =======================================================================
I will look into the cause and implement more tests! Will also look into further optimizing the for loop.
I have updated
_get_cartesian_face_edge_nodesand_get_lonlat_rad_face_edge_nodeswith the proposed vectorization and revised the test cases accordingly. The update successfully passed two out of three tests:Test Output (uxarray_build) amber@Amber-Desktop:~/git/uxarray$ pytest -k TestFaceEdgeConnectivityHelper test/test_helpers.py ========================================================================================= test session starts ========================================================================================== platform linux -- Python 3.11.9, pytest-8.2.0, pluggy-1.5.0 rootdir: /home/amber/git/uxarray configfile: pyproject.toml plugins: cov-5.0.0 collected 16 items / 13 deselected / 3 selected test/test_helpers.py F.. [100%]
=============================================================================================== FAILURES =============================================================================================== ____________________________________________________________ TestFaceEdgeConnectivityHelper.test_get_cartesian_face_edge_nodes_filled_value ____________________________________________________________
self = <test.test_helpers.TestFaceEdgeConnectivityHelper testMethod=test_get_cartesian_face_edge_nodes_filled_value>
def test_get_cartesian_face_edge_nodes_filled_value(self): # Create the vertices for the grid, based around the North Pole vertices = [[0.5, 0.5, 0.5], [-0.5, 0.5, 0.5], [-0.5, -0.5, 0.5], [0.5, -0.5, 0.5]] # Normalize the vertices vertices = [x / np.linalg.norm(x) for x in vertices] vertices.append([INT_FILL_VALUE, INT_FILL_VALUE, INT_FILL_VALUE]) # Construct the grid from the vertices grid = ux.Grid.from_face_vertices(vertices, latlon=False) # Extract the necessary grid data face_node_conn = grid.face_node_connectivity.values n_nodes_per_face = np.array([len(face) for face in face_node_conn]) n_face = len(face_node_conn) n_max_face_edges = max(n_nodes_per_face) node_x = grid.node_x.values node_y = grid.node_y.values node_z = grid.node_z.values # Call the function to testface_edges_connectivity_cartesian = _get_cartesian_face_edge_nodes(face_node_conn, n_nodes_per_face, n_face, n_max_face_edges, node_x, node_y, node_z )test/test_helpers.py:297:
face_node_conn = array([[ 3, 1, 0, 2, -9223372036854775808]]), n_nodes_per_face = array([5]), n_face = 1 n_max_face_edges = 5, node_x = array([-0.57735027, -0.57735027, 0.57735027, 0.57735027]), node_y = array([-0.57735027, 0.57735027, -0.57735027, 0.57735027]) node_z = array([0.57735027, 0.57735027, 0.57735027, 0.57735027])
def _get_cartesian_face_edge_nodes( face_node_conn, n_nodes_per_face, n_face, n_max_face_edges, node_x, node_y, node_z ): """Construct an array to hold the edge Cartesian coordinates connectivity for multiple faces in a grid. Parameters ---------- face_node_conn : np.ndarray An array of shape (n_face, n_max_face_edges) containing the node indices for each face. n_nodes_per_face : np.ndarray An array of shape (n_face,) indicating the number of nodes for each face. n_face : int The number of faces in the grid. n_max_face_edges : int The maximum number of edges for any face in the grid. node_x : np.ndarray An array of shape (n_nodes,) containing the x-coordinate values of the nodes. node_y : np.ndarray An array of shape (n_nodes,) containing the y-coordinate values of the nodes. node_z : np.ndarray An array of shape (n_nodes,) containing the z-coordinate values of the nodes. Returns ------- face_edges_cartesian : np.ndarray An array of shape (n_face, n_max_face_edges, 2, 3) containing the Cartesian coordinates of the edges for each face. It might contain dummy values if the grid has holes. Examples -------- >>> face_node_conn = np.array([[0, 1, 2, 3], [0, 1, 2, 3]]) >>> n_nodes_per_face = np.array([4, 4]) >>> n_face = 2 >>> n_max_face_edges = 4 >>> node_x = np.array([[0, 1, 1, 0], [0, 1, 1, 0]]) >>> node_y = np.array([[0, 0, 1, 1], [0, 0, 1, 1]]) >>> node_z = np.array([[0, 0, 0, 0], [0, 0, 0, 0]]) >>> _get_cartesian_face_edge_nodes(face_node_conn, n_nodes_per_face, n_face, n_max_face_edges, node_x, node_y, node_z) array([[[[0, 0, 0], [1, 0, 0]], [[1, 0, 0], [1, 1, 0]], [[1, 1, 0], [0, 1, 0]], [[0, 1, 0], [0, 0, 0]]], [[[0, 0, 0], [1, 0, 0]], [[1, 0, 0], [1, 1, 0]], [[1, 1, 0], [0, 1, 0]], [[0, 1, 0], [0, 0, 0]]]]) """ # Shift node connections to create edge connections face_node_conn_shift = np.roll(face_node_conn, -1, axis=1) # Close the loop for each face by connecting the last node to the first node for i, final_node_idx in enumerate(n_nodes_per_face): face_node_conn_shift[i, final_node_idx - 1] = face_node_conn[i, 0] # Construct edge connections by combining original and shifted node connections face_edge_conn = np.array([face_node_conn, face_node_conn_shift]).T.swapaxes(0, 1) # Reshape edge connections and create a mask for valid edges face_edge_conn_ravel = face_edge_conn.reshape((n_face * n_max_face_edges, 2)) non_fill_value_mask = face_edge_conn_ravel[:, 0] != INT_FILL_VALUE # Extract Cartesian coordinates for the edge nodes using the mask edge_node_x_a = node_x[face_edge_conn_ravel[:, 0][non_fill_value_mask]] edge_node_y_a = node_y[face_edge_conn_ravel[:, 0][non_fill_value_mask]] edge_node_z_a = node_z[face_edge_conn_ravel[:, 0][non_fill_value_mask]]edge_node_x_b = node_x[face_edge_conn_ravel[:, 1][non_fill_value_mask]]E IndexError: index -9223372036854775808 is out of bounds for axis 0 with size 4
uxarray/grid/utils.py:258: IndexError =========================================================================================== warnings summary =========================================================================================== test/test_helpers.py: 144 warnings /home/amber/miniconda3/envs/uxarray_build/lib/python3.11/site-packages/pyfma/_main.py:10: DeprecationWarning: np.find_common_type is deprecated. Please use
np.result_typeornp.promote_types. See https://numpy.org/devdocs/release/1.25.0-notes.html and the docs for more information. (Deprecated NumPy 1.25) dtype = np.find_common_type([], [a.dtype, b.dtype, c.dtype])-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ======================================================================================= short test summary info ======================================================================================== FAILED test/test_helpers.py::TestFaceEdgeConnectivityHelper::test_get_cartesian_face_edge_nodes_filled_value - IndexError: index -9223372036854775808 is out of bounds for axis 0 with size 4 ======================================================================= 1 failed, 2 passed, 13 deselected, 144 warnings in 1.32s =======================================================================
I will look into the cause and implement more tests! Will also look into further optimizing the for loop.
Thank you very much for your help! Seems like issues arise when dealing with the fill-values.
Hi @philipc2 ,
As mentioned during last UXarray meeting, can I have your fill-value test-case for your implementation? Me and @amberchen122 were unable to make your implementation pass the grids with fill values. Thanks
Hi @philipc2 ,
As mentioned during last UXarray meeting, can I have your fill-value test-case for your implementation? Me and @amberchen122 were unable to make your implementation pass the grids with fill values. Thanks
Working on it this week! Apologies for the delay, was on PTO last week.
Hi @philipc2 , As mentioned during last UXarray meeting, can I have your fill-value test-case for your implementation? Me and @amberchen122 were unable to make your implementation pass the grids with fill values. Thanks
Working on it this week! Apologies for the delay, was on PTO last week.
No worries. The codes cannot cope with the fill value case anyway. We have rewrote a new algorithmns
Hi Team,
Here's an update on the progress:
- I have fully vectorized the
get_cartesian_face_edge_nodesandget_lonlat_rad_face_edge_nodesfunctions, and fixed a bug where the previous code did not handleINT_FILL_VALUE. - I updated and passed all tests (including those with
INT_FILL_VALUE) undertest/test_helpers.py::TestFaceEdgeConnectivityHelper. - I updated all calls to the modified functions.
Could I please request a review of these updates to determine if they are ready for merging?
Thanks! Amber
### After
### Comparison
Looks like a solid ~1.6x increase in performance for the `geoflow-small` grid.

### After
### Comparison
Looks like a solid ~1.6x increase in performance for the `geoflow-small` grid.

This face has 5/10 of the maximum edges, with the rest being filled with `np.nan`.
I'm only able to get it to run for meshes where all elements have the same dimension.