Modify Appleyard chopping
Restrict the updates of rs/rv, rsw/rvw and zfraction in the extended blackoil model by the saturation scaling factor from the Appleyard chopping.
This is part 1 of https://github.com/OPM/opm-models/pull/803
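For context, here is a minimal, hypothetical sketch of the idea in Python (it is not the opm-models C++ code; the chop limit, variable names and the exact set of chopped variables are assumptions): the scaling factor that the Appleyard chop computes for the saturation update is reused to scale the updates of the dissolution/vaporization ratios and zfraction, rather than applying the full Newton update to them independently.

```python
# Hypothetical illustration only, not the opm-models implementation.

MAX_DELTA_SAT = 0.2  # assumed maximum saturation change per Newton iteration

# Variables whose Newton update is scaled by the Appleyard factor
# (saturations plus, with this change, the rs/rv-type variables).
CHOPPED_VARS = {"sw", "sg", "rs", "rv", "rsw", "rvw", "zfraction"}


def appleyard_factor(sat_deltas, max_delta=MAX_DELTA_SAT):
    """Scaling factor in (0, 1] that limits the largest proposed
    saturation change to max_delta (the Appleyard chop)."""
    if not sat_deltas:
        return 1.0
    largest = max(abs(d) for d in sat_deltas)
    return 1.0 if largest <= max_delta else max_delta / largest


def apply_update(primary_vars, delta):
    """Apply the Newton update for one cell, where delta[k] is the
    proposed change of variable k."""
    alpha = appleyard_factor([delta[k] for k in ("sw", "sg") if k in delta])
    updated = dict(primary_vars)
    for key, change in delta.items():
        scale = alpha if key in CHOPPED_VARS else 1.0
        updated[key] = primary_vars[key] + scale * change
    return updated
```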
jenkins build this please
benchmark please
Black is PR. Green is reference results. All failures below are in the summary file comparisons:

| Vector | Failing entries | Largest absolute error | Largest relative error |
|---|---|---|---|
| FWIR | 3 | 2.9583301e+02 | 1.7041231e-01 |
| GGPI:INJE | 2 | 1.9626675e+08 | 2.3400146e-01 |
| WWPR:PROD3 | 1 | 2.2281599e-01 | 7.5347273e-02 |
| WWPGR:PROD1 | 1 | 1.2884613e+02 | 1.1923457e-01 |
| WWPR:PROD3 | 6 | 5.3414059e-01 | 1.0045919e-01 |
| WLPR:OPU02 | 4 | 1.0356105e+02 | 4.8567018e-01 |
| WWIR:WIL01 | 3 | 3.1060254e+02 | 7.2784564e-02 |
| WWPR:PROD3 | 4 | 2.0443020e+00 | 1.0837778e-01 |
| WGIP:INJ1 | 3 | 1.3314957e+08 | 1.8527693e-01 |
| WWPR:PROD3 | 1 | 1.1698884e-01 | 2.0989576e-01 |
| WWPR:PROD3 | 1 | 1.1698884e-01 | 2.0989576e-01 |
| WWPR:PROD1 | 1 | 1.2119231e+00 | 1.9043798e-01 |
| WWPR:PROD1 | 2 | 1.6109295e+00 | 1.0339842e-01 |
| WBHP:INJ1 | 1 | 2.6048584e+01 | 7.7440490e-02 |
| WBHP:INJ1 | 1 | 4.0556641e+01 | 9.2056089e-02 |
| WTHP:PROD1 | 1 | 6.5482483e+00 | 6.0313764e-02 |
| WWPR:PROD1 | 2 | 8.0917358e+01 | 2.1558314e-01 |
| WOPR:PROD3 | 1 | 4.6544556e+01 | 1.1191940e-01 |
| WOPR:OP_1 | 1 | 3.7582776e+00 | 9.9941726e-01 |
| WWIR:WI_1 | 2 | 1.3175140e+02 | 7.6847942e-01 |
| WOPR:OP_1 | 2 | 1.8420239e+02 | 1.9601552e-01 |
| WOPR:OP_1 | 4 | 2.2667194e+02 | 3.0097662e-01 |
| WTHP:PROD3 | 9 | 2.4072021e+01 | 5.8558714e-01 |
| WOPR:PROD3 | 1 | 1.9500918e+02 | 7.5801425e-01 |
I have manually gone through the test failures and plotted the significant deviations. Cases not shown have only minor changes. The significant differences point to changes in the time-stepping, which in turn affect the results. Some more testing on field models is needed before concluding whether these changes improve the stability of the Newton update, but the test models are OK IMO.
I think it would be good to automate the process of evaluating the test failures; currently this involves significant manual work. What I have done this time is to go through the test failures and plot the worst offending vectors using qsummary, for example:
```sh
~/workspace/opm/qsummary/build/qsummary \
    flow+udq_wconprod/UDQ_WCONPROD. \
    ~/workspace/opm/opm-tests/udq_actionx/opm-simulation-reference/flow/UDQ_WCONPROD. \
    -v WLPR:OPU02
```
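A small script could drive this automatically by parsing the failure report and invoking qsummary for each offending vector. Below is a rough, hypothetical Python sketch; the report format, the error threshold and the assumption that qsummary is on the PATH are mine, not part of the current test infrastructure:

```python
#!/usr/bin/env python3
"""Hypothetical sketch: parse a regression failure report and plot every
offending summary vector with qsummary."""
import re
import subprocess
import sys

QSUMMARY = "qsummary"  # assumed to be on PATH


def parse_failures(report_path):
    """Yield (vector, largest_relative_error) pairs from lines such as
    'WLPR:OPU02: Summary file' followed by 'Largest relative error: ...'."""
    vector = None
    with open(report_path) as report:
        for line in report:
            name = re.match(r"^(\w+(?::\w+)?): Summary file", line)
            if name:
                vector = name.group(1)
                continue
            err = re.match(r"^Largest relative error:\s*([\d.eE+-]+)", line)
            if err and vector is not None:
                yield vector, float(err.group(1))
                vector = None


def plot_worst(report_path, pr_case, ref_case, threshold=0.1):
    """Call qsummary for each vector whose relative error exceeds threshold."""
    for vector, rel_err in parse_failures(report_path):
        if rel_err >= threshold:
            subprocess.run([QSUMMARY, pr_case, ref_case, "-v", vector],
                           check=False)


if __name__ == "__main__":
    # Usage: plot_failures.py <report.txt> <pr_case_prefix> <ref_case_prefix>
    plot_worst(sys.argv[1], sys.argv[2], sys.argv[3])
```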
@akva2 What do you think? Could the current test infrastructure be extended with such a workflow?
Benchmark result overview:
| Test | Configuration | Relative speed-up |
|---|---|---|
| opm-git | OPM Benchmark: drogon - Threads: 1 | 0.979 |
| opm-git | OPM Benchmark: drogon - Threads: 8 | 0.617 |
| opm-git | OPM Benchmark: smeaheia - Threads: 1 | 1 |
| opm-git | OPM Benchmark: smeaheia - Threads: 8 | 1 |
| opm-git | OPM Benchmark: spe10_model_1 - Threads: 1 | 0.991 |
| opm-git | OPM Benchmark: spe10_model_1 - Threads: 8 | 1.001 |
| opm-git | OPM Benchmark: flow_mpi_extra - Threads: 1 | 1.06 |
| opm-git | OPM Benchmark: flow_mpi_extra - Threads: 8 | 0.945 |
| opm-git | OPM Benchmark: flow_mpi_norne - Threads: 1 | 1.016 |
| opm-git | OPM Benchmark: flow_mpi_norne - Threads: 8 | 0.975 |
- Speed-up = Total time master / Total time pull request. Above 1.0 is an improvement (e.g., 0.617 for drogon with 8 threads means the PR run took roughly 1/0.617 ≈ 1.6 times as long as master).
View result details @ https://www.ytelses.com/opm/?page=result&id=2114
> I think it would be good to automate the process of evaluating the test failures; currently this involves significant manual work.
Yes, indeed. Manually going through all the test failures (very often there are tens of them) is significant work. When the time stepping changes, the current jenkins comparisons basically stop being meaningful. If jenkins could produce all the relevant plots, that would be a big step in the right direction.
I looked in a bit more detail at the benchmark results for drogon. With CPRW the performance is similar for this PR and master.
It was suggested by @hnil to use fixed timesteps with no chopping for all feature tests, to avoid this quagmire. I think that is a good idea that could avoid a lot of extra work with tests that seemingly fail. An alternative that may be a bit weaker would be to only compare the solutions at the report steps.
For this concrete PR, I very much want to say "go ahead" but the changes are large enough to make me a little nervous. Maybe for one or a few of the "most failing" cases you could take a reference run, extract the timesteps, then run using the PR with the fixed steps, and see if the difference is then significant? (If not, then the difference was only caused by different timestepping and we are good to go.)
> Maybe for one or a few of the "most failing" cases you could take a reference run,
The difficulty is that it is not easy to tell which cases are the "most failing" ones. We might overlook real problems, for example a bug, and then the purpose of the jenkins regression tests is defeated to a large extent.
> The difficulty is that it is not easy to tell which cases are the "most failing" ones. We might overlook real problems, for example a bug, and then the purpose of the jenkins regression tests is defeated to a large extent.
I agree, and I did not intend this as a new general procedure, only as a way to assess the test failures here a bit better, while at the same time seeing if the idea of fixing timesteps is workable.
jenkins build this please