Everest run terminates unexpectedly

Open tup1985 opened this issue 1 month ago • 1 comments

The following setup fails consistently and I am either missing the knowledge to understand why or the actual reason is not being captured/propagated clearly.

Run produced with: everest:16.0.13, ropt:0.24.0, ert:16.0.13

The optimization parameters are shown below, only to illustrate that the run should have moved forward as long as 50% of the simulations succeed and the batch number is less than 10:

optimization:
  min_realizations_success: 50
  max_batch_num: 10
  perturbation_num: 2
  auto_scale: True
  options:
    - "max_step = 0.2"

In batch 2, one perturbation of one realization fails due to convergence issues in the simulator. This is the only failure in the entire ensemble (evaluations and perturbations) for the current batch.

===================== Running forward models (Batch #2) ======================

  Waiting: 0 | Pending: 0 | Running: 0 | Finished: 19 | Failed: 1

               symlink: 0/20/0 | Finished: 0-19
        copy_directory: 0/20/0 | Finished: 0-19
               symlink: 0/20/0 | Finished: 0-19
             copy_file: 0/20/0 | Finished: 0-19
      well_constraints: 0/20/0 | Finished: 0-19
         add_templates: 0/20/0 | Finished: 0-19
              schmerge: 0/20/0 | Finished: 0-19
                  flow: 0/19/1 | Finished: 0-3, 5-19 | Failed: 4
                  flow: Failed: Process exited with status code 255, realizations: 4
  extract_summary_data: 0/19/0 | Finished: 0-3, 5-19
  extract_summary_data: 0/19/0 | Finished: 0-3, 5-19


Everest run failed with: Optimization failed: not enough successful realizations to proceed.
flow Failed with: Process exited with status code 255

Based on what I defined in the optimization section, there are enough realizations to proceed, so the Everest failure message is not clear to me. I have not defined a min_pert_success criterion, so maybe this gets triggered somehow, but I get no hint of this in any of the logs or terminal output.

Since I get no error from the optimizer for this batch (the dakota files look fine, as the optimization part is not even reached in this batch), the only place to look for the actual reason is the Everest logs, but there I don't get more than the failed simulation for realization 4:

2025-12-03 15:10:29,261 everest ERROR: Simulation 4 failed.

Lastly, regarding the failed realization 4: all Everest logs refer to realization 4, but this index is not helpful either, as it doesn't correspond to any realization/evaluation/perturbation index that I specified in the config or that I can find in the runpath. This is probably a subject for a separate issue, but the most helpful message would report the actual realization number, not an index converted from the number of perturbations per realization I have defined.

Dec 04 '25 12:12 tup1985

Reran the case with min_pert_success defined, and execution continues past the failed simulation.
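
For reference, a minimal sketch of what such an optimization section could look like, reusing the settings from the original report. The value 1 is an assumed placeholder; the value actually used in the rerun is not stated here.

```yaml
optimization:
  min_realizations_success: 50
  # Assumed placeholder value, not the one used in the rerun.
  min_pert_success: 1
  max_batch_num: 10
  perturbation_num: 2
  auto_scale: True
  options:
    - "max_step = 0.2"
```

If the keyword counts successful perturbations per realization (an assumption about its semantics), a value of 1 with perturbation_num: 2 would tolerate one failed perturbation, which matches the single failure seen in batch 2.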

If min_pert_success is not defined, then a default other than NULL should be in place, or a message/warning/log entry should be generated that clearly states this.

I think, in the past, we completely bypassed this by having speculative: True at all times. I will mention that it is not completely straightforward what the implications of having or not having each of these options are.

Dec 04 '25 13:12 tup1985