uxarray
uxarray copied to clipboard
Dual Mesh Construction
Closes #825
Overview
Constructs the dual mesh of a grid using face_centers and node_face_connectivity
Expected Usage
import uxarray as ux
grid_path = "/path/to/grid.nc"
grid = ux.open_grid(grid_path)
dual = grid.compute_dual()
PR Checklist
General
- [x] An issue is linked created and linked
- [x] Add appropriate labels
- [x] Filled out Overview and Expected Usage (if applicable) sections
Testing
- [x] Adequate tests are created if there is new functionality
- [x] Tests cover all possible logical paths in your function
- [x] Tests are not too basic (such as simply calling a function and nothing else)
Documentation
- [x] Docstrings have been added to all new functions
- [x] Docstrings have updated with any function changes
- [x] Internal functions have a preceding underscore (
_) and have been added todocs/internal_api/index.rst - [x] User functions have been added to
docs/user_api/index.rst
@philipc2 Thoughts on the API? Currently, I return a dual grid object as its own thing. This made sense to me, as changing the original grid would be difficult due to the different dimensions.
@philipc2 Thoughts on the API? Currently, I return a
dualgrid object as its own thing. This made sense to me, as changing the original grid would be difficult due to the different dimensions.
This type of implementation is what I was envisioning, since we don't want anything to be replaced in-place when creating the dual grid.
Let's name it something like Grid.compute_dual(method=...), where we method could indicate what type of dual we'd like to compute.
We should add functionality that can also convert a UxDataset or UxDataArray into it's "dual" form, which would involve making any face-centered data variable now node-centered, and vice-versa. This would however cause issues if the data variable is edge centered, since those edges would no longer exist.
This was the block that we were able to get working in the Notebook.
import uxarray as ux
from uxarray.constants import INT_DTYPE, INT_FILL_VALUE
import numpy as np
uxgrid = ux.open_grid("/Users/philipc/uxarray-gold/uxarray-gold/uxarray/test/meshfiles/geos-cs/c12/test-c12.native.nc4")
lonlat_t = [(lon, lat) for lon, lat in zip(uxgrid.node_lon.values, uxgrid.node_lat.values)]
# # Dictionary to track first occurrence and subsequent indices
occurrences = {}
# Iterate through the list and track occurrences
for index, tpl in enumerate(lonlat_t):
if tpl in occurrences:
occurrences[tpl].append((INT_DTYPE(index)))
else:
occurrences[tpl] = [INT_DTYPE(index)]
duplicate_dict = {}
for tpl, indices in occurrences.items():
if len(indices) > 1:
source_idx = indices[0]
for duplicate_idx in indices[1:]:
duplicate_dict[duplicate_idx] = source_idx
new_face_node_connectivity = uxgrid.face_node_connectivity.values.copy().ravel()
for idx, item in enumerate(new_face_node_connectivity):
# O(1)
if item in duplicate_dict:
new_face_node_connectivity[idx] = duplicate_dict[item]
new_face_node_connectivity = new_face_node_connectivity.reshape((uxgrid.n_face, uxgrid.n_max_face_nodes))
node_face_conn = {node_i: [] for node_i in range(uxgrid.n_node)}
for face_i, face_nodes in enumerate(new_face_node_connectivity):
for node_i in face_nodes:
if node_i != ux.INT_FILL_VALUE:
node_face_conn[node_i].append(face_i)
n_max_node_faces = -1
for face_indicies in node_face_conn.values():
if len(face_indicies) > n_max_node_faces:
n_max_node_faces = len(face_indicies)
node_face_connectivity = np.full((uxgrid.n_node, n_max_node_faces), INT_FILL_VALUE)
for node_idx, face_indices in enumerate(node_face_conn.values()):
n_faces = len(face_indices)
node_face_connectivity[node_idx, 0:n_faces] = face_indices
new_uxgrid = ux.Grid.from_topology(uxgrid.node_lon.values,
uxgrid.node_lat.values,
new_face_node_connectivity,
node_face_connectivity=node_face_connectivity)
Thanks @philipc2! I will see if I can help figure out what's going on.
It is definitely something with how you are updating and returning the grid, I did a very basic implementation just as a function inside the grid class, called by compute dual at the start, and it works perfectly. Red is the original grid.
It is definitely something with how you are updating and returning the grid, I did a very basic implementation just as a function inside the grid class, called by compute dual at the start, and it works perfectly. Red is the original grid.
Right, it'd be better to compare the grid objects returned in each case, specifically the merged nodes part in both cases (notebook and installed versions)
Check out this pull request on ![]()
See visual diffs & provide feedback on Jupyter Notebooks.
Powered by ReviewNB
It is definitely something with how you are updating and returning the grid, I did a very basic implementation just as a function inside the grid class, called by compute dual at the start, and it works perfectly. Red is the original grid.
Yeah, I'm almost wondering if there might be a bug somewhere in our Grid.from_topology(). Going to do some more digging.
It is definitely something with how you are updating and returning the grid, I did a very basic implementation just as a function inside the grid class, called by compute dual at the start, and it works perfectly. Red is the original grid.
Yeah, I'm almost wondering if there might be a bug somewhere in our
Grid.from_topology(). Going to do some more digging.
It's working now though? I think it was just a problem with pip install . I have been having trouble getting it to work. That's what the notebook is for, just testing.
This can come out of draft now?
@philipc2 Are you fine with me splitting this into two PRs and just handling the dual mesh here? That way we can just focus on that and get it merged instead of worrying about why the duplication stuff isn't working?
@philipc2 Are you fine with me splitting this into two PRs and just handling the dual mesh here? That way we can just focus on that and get it merged instead of worrying about why the duplication stuff isn't working?
Hi @aaronzedwick
I had a local branch with the changes split, let me get that pushed and set up as a PR.
@philipc2 Are you fine with me splitting this into two PRs and just handling the dual mesh here? That way we can just focus on that and get it merged instead of worrying about why the duplication stuff isn't working?
Hi @aaronzedwick
I had a local branch with the changes split, let me get that pushed and set up as a PR.
Awesome, thank you Philip!
ASV Benchmarking
Benchmark Comparison Results
Benchmarks that have improved:
| Change | Before [ec5cc8f2] | After [cddbb76b] | Ratio | Benchmark (Parameter) |
|---|---|---|---|---|
| - | 2.19±0.01ms | 1.87±0.04ms | 0.85 | mpas_ocean.ConnectivityConstruction.time_n_nodes_per_face('120km') |
| failed | 122±0.9ms | n/a | mpas_ocean.DualMesh.time_dual_mesh_construction('120km') | |
| failed | 8.81±0.4ms | n/a | mpas_ocean.DualMesh.time_dual_mesh_construction('480km') | |
| - | 465M | 409M | 0.88 | mpas_ocean.Integrate.peakmem_integrate('480km') |
Benchmarks that have stayed the same:
| Change | Before [ec5cc8f2] | After [cddbb76b] | Ratio | Benchmark (Parameter) |
|---|---|---|---|---|
| 438M | 441M | 1.01 | face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc')) | |
| 442M | 441M | 1.00 | face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/scrip/outCSne8/outCSne8.nc')) | |
| 444M | 444M | 1.00 | face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/geoflow-small/grid.nc')) | |
| 443M | 442M | 1.00 | face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/quad-hexagon/grid.nc')) | |
| 1.59±0s | 1.58±0s | 1.00 | face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc')) | |
| 222±2ms | 220±0.1ms | 0.99 | face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/scrip/outCSne8/outCSne8.nc')) | |
| 2.07±0.01s | 1.99±0s | 0.96 | face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/geoflow-small/grid.nc')) | |
| 7.81±0.1ms | 7.48±0.2ms | 0.96 | face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/quad-hexagon/grid.nc')) | |
| 3.08±0.05s | 3.06±0.03s | 0.99 | import.Imports.timeraw_import_uxarray | |
| 665±7μs | 668±7μs | 1.00 | mpas_ocean.CheckNorm.time_check_norm('120km') | |
| 440±3μs | 428±3μs | 0.97 | mpas_ocean.CheckNorm.time_check_norm('480km') | |
| 666±30ms | 652±6ms | 0.98 | mpas_ocean.ConnectivityConstruction.time_face_face_connectivity('120km') | |
| 41.5±1ms | 40.8±0.3ms | 0.98 | mpas_ocean.ConnectivityConstruction.time_face_face_connectivity('480km') | |
| 516±5μs | 509±10μs | 0.99 | mpas_ocean.ConnectivityConstruction.time_n_nodes_per_face('480km') | |
| 6.96±0.2ms | 6.75±0.03ms | 0.97 | mpas_ocean.ConstructFaceLatLon.time_cartesian_averaging('120km') | |
| 3.62±0.05ms | 3.56±0.04ms | 0.98 | mpas_ocean.ConstructFaceLatLon.time_cartesian_averaging('480km') | |
| 3.61±0.01s | 3.58±0s | 0.99 | mpas_ocean.ConstructFaceLatLon.time_welzl('120km') | |
| 228±2ms | 230±3ms | 1.01 | mpas_ocean.ConstructFaceLatLon.time_welzl('480km') | |
| 1.29±0μs | 1.25±0μs | 0.97 | mpas_ocean.ConstructTreeStructures.time_ball_tree('120km') | |
| 317±4ns | 324±20ns | 1.02 | mpas_ocean.ConstructTreeStructures.time_ball_tree('480km') | |
| 826±2ns | 871±10ns | 1.05 | mpas_ocean.ConstructTreeStructures.time_kd_tree('120km') | |
| 306±5ns | 304±20ns | 0.99 | mpas_ocean.ConstructTreeStructures.time_kd_tree('480km') | |
| 1.07±0.01s | 1.05±0.01s | 0.98 | mpas_ocean.GeoDataFrame.time_to_geodataframe('120km', False) | |
| 57.4±3ms | 54.5±0.5ms | 0.95 | mpas_ocean.GeoDataFrame.time_to_geodataframe('120km', True) | |
| 80.9±1ms | 78.8±0.8ms | 0.97 | mpas_ocean.GeoDataFrame.time_to_geodataframe('480km', False) | |
| 5.54±0.3ms | 5.46±0.04ms | 0.99 | mpas_ocean.GeoDataFrame.time_to_geodataframe('480km', True) | |
| 318M | 318M | 1.00 | mpas_ocean.Gradient.peakmem_gradient('120km') | |
| 298M | 295M | 0.99 | mpas_ocean.Gradient.peakmem_gradient('480km') | |
| 2.73±0.05ms | 2.68±0.02ms | 0.98 | mpas_ocean.Gradient.time_gradient('120km') | |
| 298±2μs | 286±2μs | 0.96 | mpas_ocean.Gradient.time_gradient('480km') | |
| 249±10μs | 224±9μs | ~0.90 | mpas_ocean.HoleEdgeIndices.time_construct_hole_edge_indices('120km') | |
| 125±1μs | 122±0.8μs | 0.98 | mpas_ocean.HoleEdgeIndices.time_construct_hole_edge_indices('480km') | |
| 424M | 425M | 1.00 | mpas_ocean.Integrate.peakmem_integrate('120km') | |
| 173±0.4ms | 179±3ms | 1.03 | mpas_ocean.Integrate.time_integrate('120km') | |
| 12.0±0.1ms | 12.6±0.1ms | 1.05 | mpas_ocean.Integrate.time_integrate('480km') | |
| 349±5ms | 356±8ms | 1.02 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'exclude') | |
| 353±4ms | 358±10ms | 1.02 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'include') | |
| 350±5ms | 357±8ms | 1.02 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'split') | |
| 23.9±0.7ms | 24.5±2ms | 1.02 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'exclude') | |
| 23.8±1ms | 23.8±0.9ms | 1.00 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'include') | |
| 24.0±1ms | 24.0±1ms | 1.00 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'split') | |
| 54.4±0.07ms | 54.7±0.4ms | 1.01 | mpas_ocean.RemapDownsample.time_inverse_distance_weighted_remapping | |
| 44.3±0.1ms | 44.5±0.1ms | 1.00 | mpas_ocean.RemapDownsample.time_nearest_neighbor_remapping | |
| 359±0.5ms | 360±1ms | 1.00 | mpas_ocean.RemapUpsample.time_inverse_distance_weighted_remapping | |
| 263±0.4ms | 264±0.9ms | 1.00 | mpas_ocean.RemapUpsample.time_nearest_neighbor_remapping | |
| 292M | 292M | 1.00 | quad_hexagon.QuadHexagon.peakmem_open_dataset | |
| 291M | 291M | 1.00 | quad_hexagon.QuadHexagon.peakmem_open_grid | |
| 6.77±0.2ms | 7.14±0.7ms | 1.05 | quad_hexagon.QuadHexagon.time_open_dataset | |
| 5.79±0.1ms | 6.03±0.6ms | 1.04 | quad_hexagon.QuadHexagon.time_open_grid |
@philipc2 This should be ready for review, I made a lot of changes and got it down from 8.69s to 127ms for the 120km benchmark.
@philipc2 This should be ready for review, I made a lot of changes and got it down from 8.69s to 127ms for the 120km benchmark.
That's a significant improvement, nicely done!
I'm checking the compatibility with #879 and we can see where to move from there.
Continuing one of my comments above, we should create a UxDataArray.get_dual() method that keeps the data stored under the Data Array, but swaps the mapping such that:
- Face Centered variables become Node Centered
- Node Centered variables become Face Centered
- An error is raised for edge centered variables
Continuing one of my comments above, we should create a
UxDataArray.get_dual()method that keeps the data stored under the Data Array, but swaps the mapping such that:
- Face Centered variables become Node Centered
- Node Centered variables become Face Centered
- An error is raised for edge centered variables
Data array or dataset? We load in datasets more often than anything else, right?
Also, something with the latest commits to main messed up my tests.
Continuing one of my comments above, we should create a
UxDataArray.get_dual()method that keeps the data stored under the Data Array, but swaps the mapping such that:
- Face Centered variables become Node Centered
- Node Centered variables become Face Centered
- An error is raised for edge centered variables
Data array or dataset? We load in datasets more often than anything else, right?
@philipc2
Continuing one of my comments above, we should create a
UxDataArray.get_dual()method that keeps the data stored under the Data Array, but swaps the mapping such that:
- Face Centered variables become Node Centered
- Node Centered variables become Face Centered
- An error is raised for edge centered variables
Data array or dataset? We load in datasets more often than anything else, right?
I would suggest making a DataArray implementation (i.e. for a single data variable), and then also create a Dataset version. The user should be able to call either.
I believe edge-centered data should actually remain the same mapping in the both the dual & primal grids, since the new edge will intersect the original one.
Continuing one of my comments above, we should create a
UxDataArray.get_dual()method that keeps the data stored under the Data Array, but swaps the mapping such that:
- Face Centered variables become Node Centered
- Node Centered variables become Face Centered
- An error is raised for edge centered variables
Data array or dataset? We load in datasets more often than anything else, right?
I would suggest making a DataArray implementation (i.e. for a single data variable), and then also create a Dataset version. The user should be able to call either.
I believe edge-centered data should actually remain the same mapping in the both the dual & primal grids, since the new edge will intersect the original one.
Not if the mesh has holes, then there will be less n_edges.
@philipc2 This is ready for review, especially the notebook I put together would be great to have some feedback on.
Left an initial notebook review. Will give it a deeper look through tomorrow.
is this good to go?
@philipc2 Could we get this merged?

