Oceananigans.jl icon indicating copy to clipboard operation
Oceananigans.jl copied to clipboard

Adds grid metrics to NetCDF output

Open tomchor opened this issue 3 years ago • 12 comments

This PR adds grid spacings to NetCDF out, which is a step closer to solving https://github.com/CliMA/Oceananigans.jl/issues/1334

EDIT:

The goal here is to add grid metrics to NetCDF output. The are two main avenues to follow:

  1. We can follow Oceananigans nomenclature and conventions, which would make the output play more nicely with Oceananigans itself (and more generally in the Julia environment).
  2. We can follow standard community conventions, which would mean the output won't follow Oceananigans naming etc., but it would optimize its readability by other software.

I think we should follow option 2, since if a user wants to work with the output in Oceananigans/Julia, then using JLD2 output is probably the right choice anyway. Given that most people in the community use Python, xarray and xgcm to analyze model output, I think we should optimize the output to work with that ecosystem out of the box. Based on the discussion in https://github.com/CliMA/Oceananigans.jl/issues/1334, it seems the preferred conventions to use are the SGRID conventions

For the more technical aspects, I'm planning on starting with RectilinearGrids and LatLonGrids in this PR since these are more straightforward. And then we can expand from there. I also think this should be presented to the user as an opt-in flag in NetCDFWriter constructor, as opposed to being included in every NetCDF output by default.

tomchor avatar Jul 12 '22 21:07 tomchor

Do we currently have a function to retrieve spacings that considers whether a cell is "wet" or not? Meaning it considers whether a cell is part of the interior of the domain or is a halo/immersed solid cell when calculating Δz? Or maybe there's a way to use abstract operation metrics to do this that I'm not aware?

For example in the script below, I can retrieve Δz directly from grid, but the edges do not reflect the fact that the z direction is bounded.

julia> using Oceananigans

julia> grid = RectilinearGrid(size = (4, 4, 4), x=(0,1), y=(0,1), z=0:0.25:1)
4×4×4 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on CPU with 3×3×3 halo
├── Periodic x ∈ [0.0, 1.0) regularly spaced with Δx=0.25
├── Periodic y ∈ [0.0, 1.0) regularly spaced with Δy=0.25
└── Bounded  z ∈ [0.0, 1.0] variably spaced with min(Δz)=0.25, max(Δz)=0.25

julia> grid.Δzᵃᵃᶠ[1:grid.Nz+1]
5-element Vector{Float64}:
 0.25
 0.25
 0.25
 0.25
 0.25

To be in line with xgcm's standards for metrics the output in the above case (considering only the "wet" or interior or fluid-filled cells when calculating Δz) should be

julia> grid.Δzᵃᵃᶠ[1:grid.Nz+1]
5-element Vector{Float64}:
 0.125
 0.25
 0.25
 0.25
 0.125

since the first and last spacings on a bounded grid are "half inside the domain and half outside".

I looked for ways to do this in that are already in the code but couldn't find anything. I wanted to ask before I started coding something from scratch.

tomchor avatar Jul 12 '22 21:07 tomchor

Hmmm, no we do not have that but we have the inactive_cell, inactive_node and peripheral_node functions, which kind of do that. (would be half if peripheral_node and zero if inactive_node)

simone-silvestri avatar Jul 12 '22 21:07 simone-silvestri

Hmmm, no we do not have that but we have the inactive_cell, inactive_node and peripheral_node functions, which kind of do that. (would be half if peripheral_node and zero if inactive_node)

I don't think this is generically true. This is only true for certain immersed boundaries --- GridFittedBottom and GridFittedBoundary.

glwagner avatar Jul 15 '22 21:07 glwagner

To be in line with https://github.com/xgcm/xgcm/issues/306#issuecomment-776124591 the output in the above case (considering only the "wet" or interior or fluid-filled cells when calculating Δz) should be

julia> grid.Δzᵃᵃᶠ[1:grid.Nz+1]
5-element Vector{Float64}:
 0.125
 0.25
 0.25
 0.25
 0.125

since the first and last spacings on a bounded grid are "half inside the domain and half outside".

This is interesting and I was not aware of this standard. To be honest, I do not understand the rationale for the standard that "chops" the face to face spacing at the boundary. I'm not sure this convention reflects a correct understanding of the staggered grid and it's interaction with the boundary. This does become functionally important when calculating gradients across the boundary (and when halo cells are used to enforce gradient or value boundary conditions). But perhaps I am missing the purpose of the convention?

glwagner avatar Jul 15 '22 21:07 glwagner

More generally though, we do need to design a function-based user interface for extracting grid metrics from any grid. This does not exist and it's not sustainable to access grid properties directly by writing things like grid.Δzᵃᵃᶠ. This method will also produce incorrect results for immersed boundaries that modify grid metrics, such as PartialCellBottom and a hypothetical cut-cell implemenation.

glwagner avatar Jul 15 '22 21:07 glwagner

Perhaps we are ready to revisit this?

glwagner avatar Mar 22 '23 16:03 glwagner

Perhaps we are ready to revisit this?

Yes, I was waiting for us to merge https://github.com/CliMA/Oceananigans.jl/pull/2842. Let's keep this open

tomchor avatar Mar 22 '23 17:03 tomchor

Before writing too much code can we discuss what we need and come up with a design together?

glwagner avatar Apr 20 '23 15:04 glwagner

Before writing too much code can we discuss what we need and come up with a design together?

Yes I agree. I am just pushing some code that I did for my own use as a starting point. I'll post a discussion starter soon.

tomchor avatar Apr 20 '23 15:04 tomchor

Now that the spacing functions are in place, I'm going to start working on this. (I'll update the top post with the information below.)

The goal here is to add grid metrics to NetCDF output. The are two main avenues to follow:

  1. We can follow Oceananigans nomenclature and conventions, which would make the output play more nicely with Oceananigans itself (and more generally in the Julia environment).
  2. We can follow standard community conventions, which would mean the output won't follow Oceananigans naming etc., but it would optimize its readability by other software.

I think we should follow option 2, since if a user wants to work with the output in Oceananigans/Julia, then using JLD2 output is probably the right choice anyway. Given that most people in the community use Python, xarray and xgcm to analyze model output, I think we should optimize the output to work with that ecosystem out of the box. Based on the discussion in https://github.com/CliMA/Oceananigans.jl/issues/1334, it seems the preferred conventions to use are the SGRID conventions

For the more technical aspects, I'm planning on starting with RectilinearGrids and LatLonGrids in this PR since these are more straightforward. And then we can expand from there. I also think this should be presented to the user as an opt-in flag in NetCDFWriter constructor, as opposed to being included in every NetCDF output by default.

tomchor avatar Apr 20 '23 16:04 tomchor

@jbusecke I hope you don't mind me recruiting your quick help here! I have a couple of questions regarding SGRID and xgcm, just to make sure I'm understanding things right:

Do things still work like your comment here describes? Meaning the metrics are still describing properties surrounding the point?

Also (related), does the treatment of spacings around boundary points still follow what you described here? (And what I tried to illustrate here for Oceananigans?)

Given these two points, can I build areas and volumes just by doing the normal operations with the correctly-defined spacings in each point?

Thanks!

tomchor avatar Apr 20 '23 16:04 tomchor

Do we currently have a function to retrieve spacings that considers whether a cell is "wet" or not?

inactive_node(i, j, k, grid, lx, ly, lz)

returns true if a cell is immersed / inactive, and false if a cell is active.

The boundary points are special, because their status depends on whether they are a prognostic or diagnostic field. Therefore we have another function,

peripheral_node(i, j, k, grid, lx, ly, lz)

which returns true if a cell is inactive or if it lies on the boundary between active and inactive (ie, the "periphery" of the domain). This distinction is only meaningful for Face-centered fields: Center locations cannot lie on the boundary.

julia> using Oceananigans

julia> grid = RectilinearGrid(size = (4, 4, 4), x=(0,1), y=(0,1), z=0:0.25:1)
4×4×4 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on CPU with 3×3×3 halo
├── Periodic x ∈ [0.0, 1.0) regularly spaced with Δx=0.25
├── Periodic y ∈ [0.0, 1.0) regularly spaced with Δy=0.25
└── Bounded  z ∈ [0.0, 1.0] variably spaced with min(Δz)=0.25, max(Δz)=0.25

julia> grid.Δzᵃᵃᶠ[1:grid.Nz+1]
5-element Vector{Float64}:
 0.25
 0.25
 0.25
 0.25
 0.25

To be in line with xgcm's standards for metrics the output in the above case (considering only the "wet" or interior or fluid-filled cells when calculating Δz) should be

julia> grid.Δzᵃᵃᶠ[1:grid.Nz+1]
5-element Vector{Float64}:
 0.125
 0.25
 0.25
 0.25
 0.125

since the first and last spacings on a bounded grid are "half inside the domain and half outside".

Are you sure? For immersed boundaries, this would mean that all the metrics must be 3D arrays even when the grid directions are separable --- greatly inflating the size of your output file.

Or maybe there's a way to use abstract operation metrics to do this that I'm not aware?

It is simple. Write a KernelFunctionOperation that checks if a node is peripheral_node. If true, divide the metric by 2, otherwise, return the metric unchanged.

However, as noted above, I'm not sure this is what you want.

Perhaps there is a purpose to the convention that divides the cell sizes on the boundary by 2. However, I'm worried this could be misleading regarding how the staggered finite volume grid and its diagnostics are interpreted. Another concern is that modifying metrics prior to output will lead to difficulties in reproducible diagnostics in the future between xgcm computations and native Oceanangians computations. So in this case I hope we can keep things simple and simply save whatever xspacings, etc outputs, if possible.

glwagner avatar Apr 20 '23 16:04 glwagner