urbansim_templates

Primary outputs of model steps vs. derived variables and other secondary outputs

Open smmaurer opened this issue 6 years ago • 0 comments

This issue lays out a strategy for handling three kinds of model step output: primary outputs, derived variables, and potential secondary outputs that are not derived variables. Tagging @mxndrwgrdnr and @janowicz in case you're interested.

Background

Current templates are designed around the idea that when a model step runs, it produces a single Orca column (pd.Series) of primary output: predicted prices, predicted choices, etc.

Sometimes there are additional relevant outputs. For example, when we allocate households or employers to buildings that have capacity constraints, the primary output is the agents' choice of buildings. The available capacity in the buildings also changes, but the template does not currently update that column.

This works out because the capacity is a derived variable that can be calculated from other data. A common pattern is to define the capacity column as a callable. With the correct Orca cache settings (the cache and cache_scope arguments of the column function), the capacities are recalculated as needed.
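As a rough illustration of the derived-variable pattern, here is a minimal pandas sketch. The table layout, the column names (residential_units, building_id), the vacant_units function, and the use of -1 for unplaced households are all assumptions made for the example, not part of the templates. In Orca the function would typically be registered as a column function rather than called directly.

```python
import pandas as pd

# Hypothetical tables: buildings with a fixed number of units, and households
# whose building_id column is the primary output of a location model step.
buildings = pd.DataFrame(
    {"residential_units": [2, 3, 1]},
    index=pd.Index([10, 11, 12], name="building_id"),
)
households = pd.DataFrame(
    {"building_id": [10, 11, 11, -1]},  # -1 marks an unplaced household (assumption)
    index=pd.Index([1, 2, 3, 4], name="household_id"),
)


def vacant_units(buildings, households):
    """Derive remaining capacity from the current household assignments."""
    occupied = households["building_id"].value_counts()
    # Keep only real buildings; the -1 placeholder is dropped by the reindex.
    occupied = occupied.reindex(buildings.index, fill_value=0)
    return buildings["residential_units"] - occupied


before = vacant_units(buildings, households)

# The model step writes only its primary output: the chosen building.
households.loc[4, "building_id"] = 12

# Capacity is never stored, so recomputing it reflects the new choice.
after = vacant_units(buildings, households)
```

Because the capacity is recomputed from the households table, the model step never has to touch it. The cost is that the value must not be served from a stale cache after the step runs.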

Advantages of the status quo

  • it supports existing use cases well
  • it's nice to have a distinction between the primary output of a model step and derived variables
  • it's nice to keep the output of model steps as simple and consistent as possible

Create standards for secondary outputs?

Users who aren't familiar with the idioms of Orca derived variables would probably expect the model step to automatically update capacities. This would be feasible, although supporting all the potential combinations of constraints and Orca column types would be a little complicated.
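For contrast, here is a hedged sketch of what an explicit capacity update could look like if the capacity were a stored column. The function name assign_and_update, the vacant_units column, and the -1 placeholder are hypothetical, and the sketch assumes every chooser was previously unplaced. Handling movers, who release capacity in their old building, is one of the complications mentioned above.

```python
import pandas as pd


def assign_and_update(choices, households, buildings):
    """Hypothetical step: write the primary output, then decrement capacity.

    Assumes every agent in `choices` was previously unplaced, so no
    capacity has to be released in a former building.
    """
    households = households.copy()
    buildings = buildings.copy()

    # Primary output: each agent's chosen building.
    households.loc[choices.index, "building_id"] = choices

    # Secondary output: stored capacity reduced by the number of new occupants.
    taken = choices.value_counts().reindex(buildings.index, fill_value=0)
    buildings["vacant_units"] = buildings["vacant_units"] - taken
    if (buildings["vacant_units"] < 0).any():
        raise ValueError("choices exceed available capacity")
    return households, buildings


buildings = pd.DataFrame(
    {"vacant_units": [1, 2]},
    index=pd.Index([10, 11], name="building_id"),
)
households = pd.DataFrame(
    {"building_id": [-1, -1, 10]},  # -1 marks an unplaced household (assumption)
    index=pd.Index([1, 2, 3], name="household_id"),
)
choices = pd.Series([11, 11], index=[1, 2])

new_households, new_buildings = assign_and_update(choices, households, buildings)
```

Here the step has to know which column holds the capacity and how it is stored, which is the extra coupling the derived-variable approach avoids.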

If secondary outputs will be common -- particularly secondary outputs that aren't derived variables -- we could put together some general functionality for this.

But so far, additional outputs tend to fall into the category of status reporting (sampled alternatives, choice probabilities, model fits) rather than info that needs to go into the core Orca representation of model state.

Conclusions

The current use pattern works, but we need to document it clearly.

We may want to support secondary outputs, but doing so will make the API more complicated, so my inclination is to wait on it.

smmaurer, Nov 06 '18 21:11