
Validate that DESIGN_MATRIX is correct upon initialization

Open xjules opened this issue 1 year ago • 3 comments

The validation should include checking whether the design sheet handles NUM_REALIZATIONS correctly, i.e. whether the design values combined with the default sheet cover the total number of realizations.

  • The REAL column (if provided) should contain iens (the realization indices), which yield the active realizations and thus fill the active realizations edit box.
  • If a parameter in the default sheet is already present in the design sheet, the default sheet entry is ignored; otherwise it is appended as a new column with a single default value.
  • Make sure that the parameter names are unique.
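
The three checks above could look roughly like the following. This is a minimal sketch, not ert's implementation: the function name `merge_design_and_defaults` and the header/rows representation of the sheets are assumptions made for illustration, and it deliberately uses plain Python rather than pandas.

```python
from collections import Counter


def merge_design_and_defaults(header, rows, defaults):
    """Hypothetical helper: combine a design sheet with a default sheet.

    header   -- column names of the design sheet, optionally including "REAL"
    rows     -- one list of values per realization, in header order
    defaults -- mapping of parameter name -> single default value
    """
    # Parameter names must be unique.
    duplicates = [name for name, count in Counter(header).items() if count > 1]
    if duplicates:
        raise ValueError(f"Duplicate parameter names in design sheet: {duplicates}")
    if any(len(row) != len(header) for row in rows):
        raise ValueError("Every design sheet row must have one value per column")

    columns = {name: [row[i] for row in rows] for i, name in enumerate(header)}

    # REAL, if provided, lists the active realizations; otherwise use row order.
    if "REAL" in columns:
        realizations = columns.pop("REAL")
        if len(set(realizations)) != len(realizations):
            raise ValueError("REAL column contains duplicate realization numbers")
    else:
        realizations = list(range(len(rows)))

    # A default is ignored when the design sheet already defines the parameter;
    # otherwise it becomes a new column holding the same value in every row.
    for name, value in defaults.items():
        if name not in columns:
            columns[name] = [value] * len(rows)
    return realizations, columns


realizations, params = merge_design_and_defaults(
    header=["REAL", "a", "b"],
    rows=[[1, 0.1, 10], [4, 0.2, 20], [7, 0.3, 30]],
    defaults={"b": 99, "c": 5},
)
print(realizations)  # [1, 4, 7]
print(params["b"])   # [10, 20, 30] (design sheet wins over the default 99)
print(params["c"])   # [5, 5, 5]
```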

~~Blocked by: https://github.com/equinor/ert/issues/8902~~

xjules, Sep 12 '24 11:09

Currently design2params will fail if you run more realizations than there are entries in the design matrix. Maybe we should do the same: just ignore NUM_REALIZATIONS if you choose a design matrix, and use the number of realizations specified in the design matrix. For instance, if you only specify realizations 1, 4, 7 in the design matrix, then it is probably expected that only realizations 1, 4, 7 are actually run?

larsevj, Sep 26 '24 13:09

> Currently design2params will fail if you run more realizations than there are entries in the design matrix. Maybe we should do the same: just ignore NUM_REALIZATIONS if you choose a design matrix, and use the number of realizations specified in the design matrix. For instance, if you only specify realizations 1, 4, 7 in the design matrix, then it is probably expected that only realizations 1, 4, 7 are actually run?

This makes sense for ensemble_experiment, but I am not sure whether we are going to use DESIGN_MATRIX in the update step. Let's focus only on the ensemble experiment for now. If we chose a subset (active realizations) from the list of realizations in the design matrix, why would this not work?
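
To make the subset idea concrete, here is a small sketch under stated assumptions: `resolve_active_realizations` and `to_mask` are hypothetical names, and the boolean-mask representation of active realizations is assumed for illustration only.

```python
def resolve_active_realizations(design_realizations, requested=None):
    """Hypothetical helper: pick the realizations to run.

    With no explicit request, every realization listed in the design matrix
    is active. An explicit request must be a subset of those realizations.
    """
    available = set(design_realizations)
    if requested is None:
        return sorted(available)
    missing = sorted(set(requested) - available)
    if missing:
        raise ValueError(
            f"Realizations {missing} are not defined in the design matrix"
        )
    return sorted(set(requested))


def to_mask(active, size):
    """Turn a list of active realization indices into a boolean mask."""
    active_set = set(active)
    return [i in active_set for i in range(size)]


print(resolve_active_realizations([1, 4, 7]))          # [1, 4, 7]
print(resolve_active_realizations([1, 4, 7], [4, 7]))  # [4, 7]
print(to_mask([1, 4, 7], 8))
# [False, True, False, False, True, False, False, True]
```

Requesting a realization that is not in the design matrix (for example `[2]` here) raises a `ValueError` instead of silently running it without design values.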

xjules, Sep 30 '24 07:09

Some of the points in this issue have been covered, but there are still a lot of validation issues. Some of them are:

  • Disallow spaces in parameter names
  • Decide if we will allow parameter names to be numbers or not
  • Handle empty or misconfigured input properly. We might rely on pandas for this, but we should check what kind of error messages we receive and decide whether we want to show these or display our own errors.
  • Find out which errors depend on each other and which do not, so that we can validate as much as possible before raising an error.
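
One way to validate as much as possible before raising is to collect the independent name errors into a list and report them together. The sketch below is illustrative only: `validate_parameter_names` and the `allow_numeric_names` flag are hypothetical, and the flag exists precisely because the question of numeric names is still undecided above.

```python
def _is_number(text):
    """Return True if float() accepts the text (this includes 'nan' and 'inf')."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def validate_parameter_names(names, allow_numeric_names=False):
    """Hypothetical helper: return every name error found, without raising."""
    errors = []
    if not names:
        errors.append("Design sheet has no parameter columns")
    seen = set()
    for name in names:
        text = str(name)
        if text.strip() == "":
            errors.append("Empty parameter name")
            continue
        if any(ch.isspace() for ch in text):
            errors.append(f"Parameter name {text!r} contains whitespace")
        if not allow_numeric_names and _is_number(text):
            errors.append(f"Parameter name {text!r} is a number")
        if text in seen:
            errors.append(f"Duplicate parameter name {text!r}")
        seen.add(text)
    return errors


errors = validate_parameter_names(["a b", "1", "ok", "ok"])
for message in errors:
    print(message)
# Parameter name 'a b' contains whitespace
# Parameter name '1' is a number
# Duplicate parameter name 'ok'
```

A caller could then raise a single error that joins all messages, so the user sees every problem in one pass rather than fixing them one at a time.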

larsevj, Oct 11 '24 08:10

The semeio forward models accept whitespace in parameter names, so I think we should do the same (and update gen_kw to do so too). It is a very small change: replacing `file_line.split()` with `shlex.split(file_line)`.

EDIT: after discussing with @oyvindeide, we concluded that we will not be changing this for gen_kw, and we will not support it in design matrix either. We are aware that this makes design_matrix differ from design2params, and if it is crucial for some users, we will add it as a feature down the line.
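
For reference, the difference between the two calls mentioned above can be seen on a quoted name. The input line is a made-up example, not an actual gen_kw or design matrix line format.

```python
import shlex

# Hypothetical input line: a quoted parameter name containing a space,
# followed by its value.
file_line = '"MY PARAM" 1.5'

# str.split() breaks on every run of whitespace, so the quoted name is split.
print(file_line.split())       # ['"MY', 'PARAM"', '1.5']

# shlex.split() honours shell-style quoting and keeps the name together.
print(shlex.split(file_line))  # ['MY PARAM', '1.5']
```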

jonathan-eq, Jan 30 '25 06:01