climt Add support for additional tracers to Emanuel Convection

What the title says. This should be done by taking in specifications of arbitrary tracers to be advected at initialization time, which get added into the properties dictionaries. These can be packed into whatever kind of array the Fortran code requires.

Jun 11 '18 18:06 mcgibbon

Actually, this is an issue that is more general than just the emanuel scheme. Even when we are not using this component, it is desirable to be able to specify tracers in some uniform way, which could be used by an advection scheme, this convection scheme or even the dynamical core for that matter. I haven't thought about it for some time, but it was not clear how one would specify arbitrary tracers within climt or sympl...

Jun 11 '18 19:06 JoyMonteiro

My understanding is that we need to supply some array with a "tracer number" dimension to the code. We can create a TracerPacker object (maybe by another name) which does what we need and can be used by components if they desire. It would need to have the following methods:

register_tracers(dims, unit_dict, input_properties), where dims is a dim specification the tracers will be fit to, such as ['*', 'mid_levels'], unit_dict is a mapping from quantity names to units, and input_properties is the property dictionary on a component into which it will insert properties for these tracers. This method would also internally store the tracer names, to be used by pack_tracers and unpack_tracers below. This would be called within a component's __init__.
pack_tracers(raw_state), which takes in the state of numpy arrays (such as given to array_call) and returns an array of dimension ['tracer_number'] + tracer_dims containing the tracers.
unpack_tracers(tracer_array), which takes a tracer array of the form returned by pack_tracers and returns a dictionary of numpy arrays (one for each tracer).
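
As a rough sketch of the three methods listed above: the class body, the alphabetical ordering of tracers, and the quantity names in the usage lines are assumptions made for illustration, not existing sympl or climt code.

```python
import numpy as np


class TracerPacker:
    """Hypothetical sketch of the proposed packer; not an existing sympl class."""

    def __init__(self):
        self._tracer_names = []
        self._dims = None

    def register_tracers(self, dims, unit_dict, input_properties):
        # Keep the names in a fixed (here: alphabetical) order so that
        # pack_tracers and unpack_tracers agree on the tracer index.
        self._dims = list(dims)
        self._tracer_names = sorted(unit_dict)
        for name in self._tracer_names:
            # Each tracer becomes an ordinary input property on the component.
            input_properties[name] = {
                'dims': list(dims),
                'units': unit_dict[name],
            }

    def pack_tracers(self, raw_state):
        # The new leading axis plays the role of the 'tracer_number' dimension.
        return np.stack(
            [raw_state[name] for name in self._tracer_names], axis=0)

    def unpack_tracers(self, tracer_array):
        # Inverse of pack_tracers: one array per registered tracer name.
        return {
            name: tracer_array[i]
            for i, name in enumerate(self._tracer_names)
        }


# Usage with made-up quantity names and a (columns, mid_levels) shape of (4, 10).
input_properties = {}
packer = TracerPacker()
packer.register_tracers(
    ['*', 'mid_levels'],
    {'co_mixing_ratio': 'kg/kg', 'so2_mixing_ratio': 'kg/kg'},
    input_properties,
)
raw_state = {
    'co_mixing_ratio': np.zeros((4, 10)),
    'so2_mixing_ratio': np.ones((4, 10)),
    'air_temperature': np.full((4, 10), 280.0),  # not a tracer, so not packed
}
packed = packer.pack_tracers(raw_state)
unpacked = packer.unpack_tracers(packed)
```

Here packed has one leading entry per registered tracer, and quantities that were never registered (air_temperature) are left out of the packed array.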

When multiple components are handling tracers, all the user has to do is pass in the same tracer unit dictionary to multiple components. dims would be determined by the component (a dynamical core for instance would specify horizontal dimensions, while a convective component would not).

This could potentially be used for other components whose inputs are unknown until you use them, like components that should relax a set of desired quantities towards zero or smooth them in the vertical or ensure they're conserved from timestep to timestep. For this reason, perhaps we should name it QuantityPacker or something similar.

Jun 11 '18 19:06 mcgibbon

If you like the above design, we should put such an object in sympl to be used by climt.

Jun 11 '18 19:06 mcgibbon

This solution adds a layer of complexity that does not seem necessary to me. The way I see it, tracers should be just another quantity in state. Maybe they should have a special name. The reason I say this is that having components "register" tracers is a violation of separation of concerns -- there is no reason for any component to know about tracers a priori.

Alternatively, array_call should check whether the dimension of the tracers quantity is non-zero and, if any tracers exist, do something with them.

If we write it this way, adding tracers will then be a responsibility of get_default_state, which will need to be extended to handle this use case.

Jun 13 '18 13:06 JoyMonteiro

I agree with what I think is the crux of your issue with this proposal, and suggest a change that fixes it in the last paragraph. For the most part my proposal is the same.

There are a few reasons for components to know about tracers a priori:

The component knows whether it handles tracers or not when you write it. Having the component expose interfaces to deal with tracers tells you about this.
We want tracers to be just like any other quantity in state. This means components must request the tracer quantities individually by name in their input properties, and list them in their output or tendency properties. This requires telling the component about the tracer names and units (along with the dimensions, which it applies equally to all tracers).
Different components may have different quantities that they consider to be tracers. For a specific example, a higher-order turbulence closure will not consider higher-order turbulent moments to be tracers, but may want its higher-order turbulent moments to be considered tracers by the horizontal advection scheme. This makes it impossible to have a single array of tracers that is used by all components.

In order to satisfy separation of concerns, the above has the component handle only its own responsibilities (exposing the correct input and output properties), and passes on all the actual work for making and unpacking tracer arrays to this new object. It is specifically because of separation of concerns that there must be a new component here - we have a new responsibility, which is combining quantity arrays into tracer arrays and vice-versa, so we need a TracerPacker.

We can't have tracers be their own quantity in the state mainly because the set of tracers is not guaranteed to be universal across components. Some components will want them as normal inputs. This requires that the State be a lot more complicated than a dict, as it has to pack and unpack tracers. And then we still have the problem that each component needs to know which tracers it's actually handling as tracers.

We could put the tracer handling functionality into the base classes in some way. For example, we could have a property allows_tracers which if True will result in the base call and init methods doing all this tracer packing and unpacking stuff, so that array_call just sees a quantity "tracers". Then when you're actually writing components, the amount of boilerplate code related to tracers is limited to one line (maybe two since it also needs to specify the dims of the tracers).
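
One way the base-class idea could look, as a minimal sketch: the TracerAwareComponent class, the tracer_names attribute, and the HalvingScheme example are all hypothetical, and the real base classes would also have to handle units, dims and validation.

```python
import numpy as np


class TracerAwareComponent:
    """Hypothetical base class; not sympl's actual implementation."""

    allows_tracers = False
    tracer_names = ()  # assumed to be filled in elsewhere, e.g. from a registry

    def __call__(self, raw_state):
        raw_state = dict(raw_state)  # shallow copy so the caller's dict is untouched
        if self.allows_tracers:
            # Replace the individual tracer arrays by one packed 'tracers' array.
            raw_state['tracers'] = np.stack(
                [raw_state.pop(name) for name in self.tracer_names], axis=0)
        raw_output = self.array_call(raw_state)
        if self.allows_tracers:
            # Split the packed output back into one quantity per tracer.
            packed = raw_output.pop('tracers')
            for i, name in enumerate(self.tracer_names):
                raw_output[name] = packed[i]
        return raw_output

    def array_call(self, raw_state):
        raise NotImplementedError


class HalvingScheme(TracerAwareComponent):
    """Toy component: the only tracer-related boilerplate is the two lines below."""

    allows_tracers = True
    tracer_names = ('co', 'so2')

    def array_call(self, raw_state):
        # array_call only ever sees the packed 'tracers' quantity.
        return {'tracers': 0.5 * raw_state['tracers']}


out = HalvingScheme()({'co': np.full((3,), 2.0), 'so2': np.full((3,), 4.0)})
```

The component author writes array_call against a single "tracers" array, while the state passed in and returned still holds one named quantity per tracer.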

Now, the part I don't like about the current proposal is having to pass in the tracer names to each component. It seems like a lot of configuration boilerplate for the user. Perhaps it would be best if there is a central registry of tracers. If you remove any quantities that a component uses as normal inputs, you have the list of tracer names specific to that component. The base classes could get that specific list of tracer names and handle packing and unpacking. It could also allow the user to override this default list using explicitly passed names as previously proposed (for if they're doing something wonky). I'd have the base classes do this within specific methods, so that component writers can easily overwrite them if they so wish. This doesn't work for the other way of doing things (having a "tracer" quantity in the state) because the set of tracers used for each component is still unique. This way still allows components to explicitly list tracer quantity names in their input and output properties.
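
A small sketch of that rule (registry minus the component's normal inputs, with an optional user override). The function names and the module-level dictionary are made up for this example and are not meant to describe sympl's API.

```python
_TRACER_UNITS = {}  # hypothetical central registry: tracer name -> units


def register_tracer(name, units):
    """Add a tracer to the central registry (illustrative helper)."""
    _TRACER_UNITS[name] = units


def tracer_names_for(input_properties, override=None):
    """Return the tracer names a given component should pack.

    By default this is every registered tracer that the component does not
    already use as a normal input; an explicit override list wins if given.
    """
    if override is not None:
        return list(override)
    return [
        name for name in sorted(_TRACER_UNITS)
        if name not in input_properties
    ]


register_tracer('co', 'kg/kg')
register_tracer('so2', 'kg/kg')
register_tracer('pm2_5', 'kg/m^3')

# A component that already takes so2 as a normal input.
component_inputs = {
    'so2': {'dims': ['*', 'mid_levels'], 'units': 'kg/kg'},
    'air_temperature': {'dims': ['*', 'mid_levels'], 'units': 'degK'},
}
default_names = tracer_names_for(component_inputs)
overridden_names = tracer_names_for(component_inputs, override=['so2'])
```

With these inputs, default_names leaves out so2 because the component already requests it as a normal input, while the override list is used verbatim.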

How does that sound?

Jun 13 '18 15:06 mcgibbon

So, let me know if I'm understanding correctly. Let's take a pollution model, which is a nice use case for tracers.

A component PollutantSource will register say SO_2, CO, PM_2.5 as tracers in a central registry, with default values and units specified. This will be done in init.
get_default_state will create quantities with these names in state, using the grid information that will be available later on.
A radiation component which has allows_tracers = False will get these quantities as inputs if it requires them for radiation calculations.
A dynamics component which has allows_tracers = True will get these quantities as a packed array with appropriate dimensions. (What happens if it also requires some of them as inputs?)
The dynamics component will advect these tracers, and then they get unpacked into their respective state quantities.

I think this would work. But in this scenario, I would think helper functions which get called in call before and after array_call would suffice, rather than a TracerPacker object. What do you think?

Jun 14 '18 09:06 JoyMonteiro

Good example!

get_default_state will automagically create these quantities in the state, since this method will put the tracers as new inputs on components.

The reason I would put these methods into a TracerPacker is separation of concerns - this is a really nice case of composition. It's similar to why I have separate classes/objects that handle validation of each of the types of properties, instead of putting them as methods on the base classes. This is a clear responsibility with a specific amount of data required to carry it out, and works well as a new object. This could be used only internally in the base classes and never exposed to the user, potentially within helper methods as you're asking for (especially if we want to allow external writers to customize the packing and unpacking methods).

My current understanding is that a dynamics component would never both require a quantity as an input and use it as a tracer. If that's the case, we should add a new property flag that would indicate this (maybe tracer which by default is False). By default, quantities present in inputs (not indicated as tracers) that are registered as tracers would not be packed into the tracer array for the object that has it as an input.
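
To make the default concrete, here is a minimal sketch of the selection rule with the proposed flag. The 'tracer' key in the property dictionary and the names_to_pack helper are hypothetical.

```python
def names_to_pack(registered_tracers, input_properties):
    """Pick which registered tracers go into a component's tracer array.

    A registered tracer is packed unless the component lists it as a normal
    input; setting the (hypothetical) 'tracer' flag to True on that input
    opts it back in to packing.
    """
    packed = []
    for name in registered_tracers:
        prop = input_properties.get(name)
        if prop is None or prop.get('tracer', False):
            packed.append(name)
    return packed


registered = ['co', 'so2', 'pm2_5']
input_properties = {
    # Plain input: registered as a tracer, but not packed for this component.
    'so2': {'dims': ['*', 'mid_levels'], 'units': 'kg/kg'},
    # Input explicitly flagged as a tracer: still packed.
    'co': {'dims': ['*', 'mid_levels'], 'units': 'kg/kg', 'tracer': True},
}
to_pack = names_to_pack(registered, input_properties)
```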

Jun 14 '18 15:06 mcgibbon

I need some input about the tracer-also-as-input thing. Can I assume for now that almost every component (~90% or more) will not use any tracer both as an input and as a tracer-packed quantity? For the case that a component does, I'll make sure there's a way to override the way packing is done on that component. It complicates the code a lot to do this in a general way for components of that kind, and would be a lot cleaner to do on those individual components which know the specifics of that component (for example, the quantity names).

Jun 16 '18 00:06 mcgibbon

I think we should ignore the use case where tracers are both inputs and tracers.

I'm still unclear what the ideal workflow with tracers should be:

Should components register tracers, or should the user? IMO, it makes more sense for the user to do so, since PollutantSource should not make assumptions about what the user wants to do with its outputs
How will tracers show up as inputs for components with allows_tracers=True? Is this being done in the __init__ of the base class? If so, should every subclass call the super's __init__?

If the responsibilities of the tracer object are to facilitate registry, removal, packing and unpacking of tracers, maybe it should be named TracerManager rather than Packer...

Jun 16 '18 09:06 JoyMonteiro

Both components and the user should register tracers. The user should be able to see and remove tracers that have been added by components. This will require some design thinking on my part because only components being used should register tracers, but also we want registration to result in updating all tracer handling components. I think this can be done, and will post details later.

Every subclass must call the superclass init in new Sympl or it will error when you call it. Modifying the init will happen within the tracer code at init time and whenever tracers are registered or removed.

Registration will probably be handled by helper functions, and the packers will only pack and unpack. Whenever you want to name an object Manager, think carefully about whether it has too many responsibilities. I'll think about this as I write more of the code.

Jun 16 '18 17:06 mcgibbon

Basically there will be some central registry of components that have been created which use tracers, and whenever the tracers are modified that registry would be used to modify all of those components' properties. Also whenever a component is initialized it should register any tracers it wants to register.

Jun 16 '18 17:06 mcgibbon

Of course, whether a component should register tracers or not is up to the component. I won't be putting anything in Sympl or the base classes that forces a component to do so. The main difficulty with having a user register all tracers is that these names won't be easy for the user to gather and put in one place. Having the component register a tracer is like saying it should be a tracer "by default", and the user is allowed to remove it as a tracer after. I think that would make sense for a PollutantSource.

Jun 16 '18 23:06 mcgibbon

While tracer support has been added to Sympl, we still haven't added tracers to Emanuel convection.

Aug 20 '18 17:08 mcgibbon
