[UNDER-DISCUSSION] Metrics in Optimization
Under Discussion Topic
A common workflow emerging among users and advertised by RAVEN developers is model calibration or parameter tuning.
In this workflow, the following conceptual pseudo-Step is considered:
<MultiRun>
<Input> Placeholder
<Optimizer>
<Model> EnsembleModel
<Output> TargetEvaluation, Results
</MultiRun>
<EnsembleModel>
<Model> ROM (from RAVEN or otherwise)
<Model> V and V Standard (experiment, code, etc)
<Model> Metrics to compare ROM and Standard
</EnsembleModel>
However, there is no current method to use the RAVEN <Metrics> within a model that can be used in an EnsembleModel.
There is a current workaround: using RAVEN-running-RAVEN, with RAVEN as the Model, allowing a single sample of both the ROM and Standard, followed by a single-realization PostProcess step, returning the Metric result. This is unnecessarily burdensome to workflow designers.
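To make the intended per-realization chain concrete, the following is a minimal sketch in plain Python, not RAVEN code. The functions `rom`, `standard`, `metric`, `evaluate_realization`, and `calibrate` are hypothetical stand-ins for the ROM, the V and V Standard, the Metric, the EnsembleModel evaluation, and the Optimizer; the exponential-decay model and the RMS metric are assumptions chosen only for illustration.

```python
import math


def rom(params, times):
    """Hypothetical surrogate model: exponential decay with a tunable rate."""
    return [math.exp(-params["rate"] * t) for t in times]


def standard(times):
    """Hypothetical V and V standard (stand-in for an experiment or reference code)."""
    return [math.exp(-0.5 * t) for t in times]


def metric(a, b):
    """Hypothetical metric, (a, b) -> float: root-mean-square difference."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def evaluate_realization(params, times):
    """One ensemble evaluation: run the ROM, run the standard, compare them."""
    return metric(rom(params, times), standard(times))


def calibrate(candidate_rates, times):
    """Stand-in for the Optimizer: keep the candidate with the smallest metric."""
    return min(
        candidate_rates,
        key=lambda rate: evaluate_realization({"rate": rate}, times),
    )


times = [0.0, 0.5, 1.0, 1.5, 2.0]
best_rate = calibrate([0.1, 0.3, 0.5, 0.7, 0.9], times)
print(best_rate)
```

The point of the sketch is that the metric is evaluated once per realization and its single float is what the optimizer sees; a real Optimizer would propose candidates adaptively rather than scan a fixed list.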
Options to consider under discussion:
- Allow PostProcessors (of per-realization type, not full-population type) to be included in EnsembleModel workflows
- Allow the Optimizer (or a parent class) to directly incorporate Metrics to collapse realization data
- Allow Metrics to be imported in ExternalModels for use within the EnsembleModel
- Encourage use of the current workflow
For Change Control Board: Issue Review
This review should occur before any development is performed as a response to this issue.
- [x] 1. Is it tagged with the under_discussion type?
- [x] 2. If implemented, it will add a new requirement?
- [x] 3. Is a rationale provided? (Such as explaining why the improvement is needed )
For Change Control Board: Issue Closure
This review should occur when the issue is imminently going to be closed.
- [ ] 1. The discussion determined the addition of a new task issue?
2022-02-02 Design Meeting Notes
We quickly concluded that including the PostProcessor in the EnsembleModel is challenging and that current use cases are not sufficient to justify the effort involved. In addition, the EnsembleModel is already a relatively complex user tool. Both points lead to a preference for the Optimizer-Metric option.
Other Benefits
With a Metric available to collapse data into a single target float, the Optimizer can also be used trivially as a weighted multi-objective optimizer.
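As a rough illustration of that collapse, the sketch below combines several per-target metric values into one float with a weighted sum. This is plain Python, not RAVEN code; the function `weighted_objective`, the target names, and the weights are all hypothetical.

```python
def weighted_objective(metric_values, weights):
    """Collapse several per-target metric values into one float via a weighted sum."""
    if set(metric_values) != set(weights):
        raise ValueError("metric_values and weights must share the same keys")
    return sum(weights[name] * metric_values[name] for name in metric_values)


# Hypothetical metric results for two targets of one realization.
metric_values = {"temperature": 2.0, "pressure": 0.5}
# Hypothetical user-chosen weights expressing the relative importance of each target.
weights = {"temperature": 0.25, "pressure": 1.0}

target = weighted_objective(metric_values, weights)
print(target)  # 1.0
```

A weighted sum is only one possible collapsing rule; it yields a single-objective problem whose solution depends on the chosen weights.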
Where to implement
We had some discussion on the correct class on which to place this Metric tool. It is clear that other feedback-enabled Samplers (i.e. all AdaptiveSamplers) can benefit from this Metric integration.
Further, ForwardSamplers could possibly use this Metric integration as well; however, the same result can be obtained by running a MultiRun to take all the samples at once, then sending them to a PostProcess step to yield the metric results on a per-realization basis. Thus there is no strict need to include Metric integration in the ForwardSampler; it would only be an input convenience.
Yet further, ForwardSamplers currently have no feedback mechanism to "tag up" with the sampled output results, so the ForwardSampler has no way to apply any postprocessing, including Metrics. A new mechanism similar to the TargetEvaluation would have to be added, which seems complex and unnecessary for the nominal benefit.
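The two-step alternative for forward sampling can be pictured as follows. This is a minimal plain-Python sketch, not RAVEN code; `model`, `metric`, the sample values, and the reference value are hypothetical.

```python
def model(x):
    """Hypothetical model under study."""
    return x * x


def metric(a, b):
    """Hypothetical metric, (a, b) -> float: absolute difference."""
    return abs(a - b)


reference = 4.0            # hypothetical reference value to compare against
samples = [1.0, 2.0, 3.0]  # all forward samples are known up front

# Step 1 (MultiRun analogue): evaluate every sample; no feedback is needed.
outputs = [model(x) for x in samples]

# Step 2 (PostProcess analogue): apply the metric to each realization.
metric_per_realization = [metric(y, reference) for y in outputs]
print(metric_per_realization)  # [3.0, 0.0, 5.0]
```

Because no sample depends on an earlier metric value, the metric can be applied after all runs finish, which is why the integration is a convenience here rather than a necessity.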
Supporting Efforts
The initiating requests that led to this discussion are from IES in dynamic model V&V work; however, there is no developer available to support the code base changes necessary at this time.
Notes
Metrics, MetricDistributor, and PostProcessor
Apparently the only current use case for the Metrics is within a PostProcessor, and there is a helper layer called the MetricDistributor that handles complex Metric needs. The responsibilities look something like:
- Metric: (a, b) -> float
- MetricDistributor: handle time-dependence, multiple-recursive metric collapsing
- PostProcessor: currently only user access to Metrics
It's not completely clear which level should be used in the AdaptiveSampler, but probably either the MetricDistributor or the PostProcessor.
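The split of responsibilities between the first two layers might be sketched as below. This is plain Python under stated assumptions, not the RAVEN implementation: `absolute_difference` and `distribute_over_time` are hypothetical names, the metric is simplified to scalar inputs, and collapsing with `max` is just one possible choice.

```python
from typing import Callable, Iterable, Sequence

# Metric layer: a plain (a, b) -> float comparison.
Metric = Callable[[float, float], float]


def absolute_difference(a: float, b: float) -> float:
    """Hypothetical scalar metric."""
    return abs(a - b)


def distribute_over_time(
    metric: Metric,
    series_a: Sequence[float],
    series_b: Sequence[float],
    collapse: Callable[[Iterable[float]], float] = max,
) -> float:
    """Hypothetical distributor layer: apply the metric at each time step,
    then collapse the per-step values into a single float."""
    if len(series_a) != len(series_b):
        raise ValueError("time series must have the same length")
    return collapse(metric(a, b) for a, b in zip(series_a, series_b))


rom_history = [1.0, 0.8, 0.5]         # hypothetical ROM output over time
standard_history = [1.0, 0.75, 0.25]  # hypothetical standard output over time

worst = distribute_over_time(absolute_difference, rom_history, standard_history)
print(worst)  # 0.25
```

Whichever layer an AdaptiveSampler ends up calling, what it needs back is the single collapsed float per realization.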
Whiteboard
The whiteboard used during the 2022-02-02 discussion is included here:
