Automated Field Solver

Open dopplershift opened this issue 2 years ago • 2 comments

Description Of Changes

Finally have a really solid cut at the last major deliverable from the last grant, and at long last close out #3. The implementation, once I got my head around it, was remarkably straightforward--owing to how much other infrastructure we have developed in xarray + units. The major pieces:

  • A registry for the functions, with a decorator to mark them. The decorator can take a full list of all return values and input fields. I also allowed it to fall back on the parameter names where possible (since we've been nicely consistent), and on the function name for the output parameter (possible less frequently, e.g. it works for heat_index but not for relative_humidity_from_mixing_ratio). Aside: why does wind_chill take speed and not wind_speed? 🤦‍♂️
  • Breadth-First Search (BFS) through the "graph" of functions. This isn't really a full graph as such, since the path from each node depends on what's needed next (and nothing out there, e.g. NetworkX, seemed to make this easier). I was actually quite happy with how that came together--owing, in all seriousness, to so many Advent of Code problems.
  • Trickier is calling each of the needed functions, automatically mapping the fields in the dataset to the function parameters. Weird naming, case sensitivity, and xarray coords vs. variables make this more complicated than I'd like.
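
To make the first two pieces concrete, here is a minimal sketch of a decorator-based registry and a breadth-first search over it. This is not the code in this PR: every name (`register`, `plan`, `solve`, `_REGISTRY`) is hypothetical, the "dataset" is a plain dict, and the arithmetic in the two toy functions is a placeholder rather than the real formulas.

```python
import inspect
from collections import deque

# Hypothetical registry: output field name -> list of (function, input names)
_REGISTRY = {}


def register(output=None, inputs=None):
    """Mark a function as a way to compute ``output`` from ``inputs``.

    Falls back to the function name for the output and to the
    parameter names for the inputs when they are not given.
    """
    def decorator(func):
        out = output or func.__name__
        ins = tuple(inputs) if inputs is not None else tuple(
            inspect.signature(func).parameters)
        _REGISTRY.setdefault(out, []).append((func, ins))
        return func
    return decorator


def plan(target, available):
    """Breadth-first search for an ordered list of steps that yields ``target``."""
    available = frozenset(available)
    start = frozenset([target]) - available
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        needed, steps = queue.popleft()
        if not needed:
            return steps  # everything still needed is already in the data
        field = sorted(needed)[0]  # resolve one missing field at a time
        for func, ins in _REGISTRY.get(field, ()):
            new_needed = (needed - {field}) | (frozenset(ins) - available)
            if new_needed not in seen:
                seen.add(new_needed)
                # Prepend so that dependencies run before the step using them
                queue.append((new_needed, [(field, func, ins)] + steps))
    return None  # no combination of registered functions reaches the target


def solve(target, data):
    """Run the planned steps against a dict of field name -> value."""
    steps = plan(target, data)
    if steps is None:
        raise KeyError(f'cannot compute {target!r} from {sorted(data)}')
    values = dict(data)
    for field, func, ins in steps:
        values[field] = func(*(values[name] for name in ins))
    return values[target]


@register(output='relative_humidity')
def relative_humidity_from_dewpoint(temperature, dewpoint):
    return 100 - 5 * (temperature - dewpoint)  # placeholder arithmetic


@register()
def heat_index(temperature, relative_humidity):
    return temperature + relative_humidity / 10  # placeholder arithmetic


result = solve('heat_index', {'temperature': 30.0, 'dewpoint': 20.0})
```

The search runs backwards from the requested field: each state is the set of fields still missing, which is one way to read "the path at each node depends on what's next needed". Because it is breadth-first, the first plan found uses the fewest function calls.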

I've also already had to fix heat_index to allow broadcasting so that when we calculate e.g. relative_humidity using 1D isobaric, things flow through the whole pipeline fine. I'm sure there's more waiting--I already tried and failed to do equivalent_potential_temperature.
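
The exact change to heat_index isn't reproduced here, but one common way a calculation breaks on inputs of different shapes is boolean-mask assignment: a mask built from one input no longer matches the shape of the combined result. The sketch below uses plain NumPy and a hypothetical function (`warm_only_index`, with placeholder arithmetic) to show the pattern of broadcasting the inputs up front.

```python
import numpy as np


def warm_only_index(temperature, relative_humidity):
    """Hypothetical calculation that masks out part of its result."""
    # Broadcast first so a mask built from either input matches the result
    temperature, relative_humidity = np.broadcast_arrays(
        temperature, relative_humidity)
    result = temperature + relative_humidity / 10  # placeholder arithmetic
    result[temperature < 20] = np.nan  # mask now has the same shape as result
    return result


# One temperature per level (3 levels) against a 2x2 horizontal grid
temperature = np.array([10.0, 25.0, 30.0]).reshape(3, 1, 1)
relative_humidity = np.full((2, 2), 50.0)
out = warm_only_index(temperature, relative_humidity)
```

Without the `np.broadcast_arrays` call, the mask would have shape (3, 1, 1) while the result has shape (3, 2, 2), and the masked assignment would raise an IndexError.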

Left to do:

  • [ ] Documentation and including this in the xarray/declarative materials
  • [ ] More heuristics and options for handling naming of fields. Right now this is kind of a crapshoot
  • [ ] Should we expose this through the xarray accessor?
  • [ ] Many more fixes for xarray indexing/broadcasting (e.g. #2069)
  • [ ] Should we try to be efficient and do indexing in declarative before the calculation?
  • [ ] More tests (like in declarative)
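
As a starting point for the naming heuristics item, here is one possible sketch of matching a function parameter to a dataset variable. It is an assumption about how this could work, not what the PR does: `match_field` and `CF_ALIASES` are hypothetical, and a dict of name -> attributes stands in for the dataset's variables and coordinates.

```python
# Hypothetical alias table: parameter name -> CF standard names that satisfy it
CF_ALIASES = {
    'temperature': ('air_temperature',),
    'dewpoint': ('dew_point_temperature',),
    'relative_humidity': ('relative_humidity',),
}


def match_field(param, variables):
    """Pick the variable to pass for ``param``, or None if nothing matches.

    ``variables`` maps variable names to attribute dicts.
    """
    # Heuristics run from most to least specific; the first one that
    # finds any candidate decides the outcome.
    heuristics = (
        lambda name, attrs: name == param,
        lambda name, attrs: name.lower() == param.lower(),
        lambda name, attrs: attrs.get('standard_name') in CF_ALIASES.get(param, ()),
        lambda name, attrs: name.lower().startswith(param.lower() + '_'),
    )
    for matches in heuristics:
        found = [name for name, attrs in variables.items() if matches(name, attrs)]
        if len(found) == 1:
            return found[0]
        if found:
            # Refuse to guess between e.g. isobaric and sigma temperature
            raise ValueError(f'{param!r} is ambiguous: {sorted(found)}')
    return None


variables = {
    'Temperature_isobaric': {},
    'Temperature_sigma': {},
    'RH': {'standard_name': 'relative_humidity'},
    'u_wind': {},
}
rh_name = match_field('relative_humidity', variables)
```

Raising on ambiguity rather than picking the first hit keeps the guesswork visible to the user, at the cost of requiring an explicit override in cases like the two temperature variables above.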

Checklist

  • [x] Closes #3
  • [x] Tests added
  • [ ] Fully documented

dopplershift avatar Aug 31 '21 05:08 dopplershift

If you want to see what this looks like currently:

import xarray as xr
from metpy.cbook import get_test_data
from metpy.plots import ContourPlot, ImagePlot, MapPanel, PanelContainer
from metpy.units import units

narr = xr.open_dataset(get_test_data('narr_example.nc', as_file_obj=False))

contour = ContourPlot()
contour.data = narr
contour.field = 'heat_index'
contour.level = 1000 * units.hPa
contour.linecolor = 'red'
contour.contours = 15

panel = MapPanel()
panel.area = 'us'
panel.layers = ['coastline', 'borders', 'states', 'rivers', 'ocean', 'land']
panel.plots = [contour]

pc = PanelContainer()
pc.size = (10, 8)
pc.panels = [panel]
pc.show()
[Image: plot produced by the code above]

dopplershift avatar Aug 31 '21 05:08 dopplershift

🎉 🎉 🎉

Definitely +1 on exposing through the Dataset accessor.


Given that this implementation is centered around parameter names, there are a few potentially problematic use cases that I wanted to get your thoughts on. I have no expectation that these are handled right away (if at all), but I wanted to bring them up now so that 1) we don't bake in any assumptions we'd regret later and 2) we can get on the same page about what is in scope and out of scope for the solver.

  • CAPE (and other profile calculations)
    • e.g., what do we do if a CF string of "atmosphere_convective_available_potential_energy" is given?
    • how are parcel options specified/defaults chosen?
    • how to enforce the need for data with a vertical dimension?
  • divergence (and other "generic" calculations)
    • a few example CF strings that'd be good to handle
      • "divergence_of_wind"
      • "derivative_of_air_temperature_wrt_time"
    • how to construct these generic things when CF naming doesn't provide guidance?
      • advection of some quantity?
      • Laplacian of some quantity?
      • Q-Vector divergence?
  • fields that need more input than just other fields
    • optional arguments (like Q-Vector's static_stability)
    • depth argument (e.g., "0-3 km Storm Relative Helicity")
  • multiple (or conflicting) variables of same (or similar) physical quantity
    • how to handle if on different levels (e.g., Temperature_isobaric, Temperature_sigma, Temperature_tropopause)
    • how to handle height (which is often the parameter name) vs. geopotential height (which is often what we mean by height, except in the geopotential conversion functions)?

jthielen avatar Aug 31 '21 19:08 jthielen