opm-common icon indicating copy to clipboard operation
opm-common copied to clipboard

Add specialized binary search skipping boundary check

Open multitalentloes opened this issue 10 months ago • 9 comments

Small adjustment that skips checking if the indices are within the range because this is already checked in the outer function. This function is very often called, so might give a small improvement.

multitalentloes avatar Apr 23 '24 12:04 multitalentloes

std::upper_bound() seems to be slower on my machine benchmarking a couple of runs on norne... Edit: To be a bit more precise, the property evaluation on single core norne on my machine took 75 sec with the original code, 72sec with this small change, and 77 sec using std::upper_bound to binary search and std::distance to get the index

multitalentloes avatar Apr 23 '24 13:04 multitalentloes

benchmark please

akva2 avatar Apr 24 '24 07:04 akva2

The measurements are probably within what can be explained by noise, so hopefully then benchmark can provide some consistent results

multitalentloes avatar Apr 24 '24 08:04 multitalentloes

Benchmark result overview:

Test Configuration Relative
opm-git OPM Benchmark: drogon - Threads: 1 - FOPT (Total Oil Production At End Of Run) 1
opm-git OPM Benchmark: drogon - Threads: 8 - FOPT (Total Oil Production At End Of Run) 1
opm-git OPM Benchmark: punqs3 - Threads: 1 - FOPT (Total Oil Production At End Of Run) 1
opm-git OPM Benchmark: punqs3 - Threads: 8 - FOPT (Total Oil Production At End Of Run) 1
opm-git OPM Benchmark: smeaheia - Threads: 1 - FOPT (Total Oil Production At End Of Run) 1
opm-git OPM Benchmark: smeaheia - Threads: 8 - FOPT (Total Oil Production At End Of Run) 1
opm-git OPM Benchmark: spe10_model_1 - Threads: 1 - FOPT (Total Oil Production At End Of Run) 1
opm-git OPM Benchmark: spe10_model_1 - Threads: 8 - FOPT (Total Oil Production At End Of Run) 1
opm-git OPM Benchmark: flow_mpi_extra - Threads: 1 - FOIT (Total Oil Injection At End Of Run) 1
opm-git OPM Benchmark: flow_mpi_extra - Threads: 8 - FOIT (Total Oil Injection At End Of Run) 1
opm-git OPM Benchmark: flow_mpi_norne - Threads: 1 - FOPT (Total Oil Production At End Of Run) 1
opm-git OPM Benchmark: flow_mpi_norne - Threads: 8 - FOPT (Total Oil Production At End Of Run) 1
opm-git OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 1 - FOPT (Total Oil Production At End Of Run) 1
opm-git OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 8 - FOPT (Total Oil Production At End Of Run) 1
  • Speed-up = Total time master / Total time pull request. Above 1.0 is an improvement. *

View result details @ https://www.ytelses.com/opm/?page=result&id=2448

ytelses avatar Apr 24 '24 22:04 ytelses

Benchmark result overview:

FOPT (Total Oil Production At End Of Run) = 1

That's good to know, but alas not particularly useful from a CPU performance point of view. @blattms: Is anything up with the benchmark test rig at the moment?

bska avatar Apr 25 '24 07:04 bska

That was a bit underwhelming, but how are these times measured since there is absolutely no deviation on any case?

multitalentloes avatar Apr 25 '24 07:04 multitalentloes

notice that those are not even times. benchmarks either didn't execute properly, or it only reported fluid statistics back.

akva2 avatar Apr 25 '24 07:04 akva2

but how are these times measured since there is absolutely no deviation on any case?

notice that those are not even times.

@akva2 is correct–the FOPT measure isn't a time at all. Rather it's a (crude) measure of solution accuracy. If FOPT differs from one here, then we didn't compute the same solution as the reference case and it in some sense doesn't really matter what performance improvement we get from the PR under test.

bska avatar Apr 25 '24 08:04 bska

Can we get some clarifications as of what happened to the performance benchmark @blattms

multitalentloes avatar Apr 29 '24 11:04 multitalentloes

benchmark please

akva2 avatar May 14 '24 08:05 akva2

Benchmark result overview:

Test Configuration Relative
opm-git OPM Benchmark: drogon - Threads: 1 0.997
opm-git OPM Benchmark: drogon - Threads: 8 0.993
opm-git OPM Benchmark: punqs3 - Threads: 1 0.978
opm-git OPM Benchmark: punqs3 - Threads: 8 1.009
opm-git OPM Benchmark: smeaheia - Threads: 1 0.973
opm-git OPM Benchmark: smeaheia - Threads: 8 1
opm-git OPM Benchmark: spe10_model_1 - Threads: 1 1.019
opm-git OPM Benchmark: spe10_model_1 - Threads: 8 0.997
opm-git OPM Benchmark: flow_mpi_extra - Threads: 1 0.989
opm-git OPM Benchmark: flow_mpi_extra - Threads: 8 0.985
opm-git OPM Benchmark: flow_mpi_norne - Threads: 1 0.984
opm-git OPM Benchmark: flow_mpi_norne - Threads: 8 0.995
opm-git OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 1 1.008
opm-git OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 8 1.001
  • Speed-up = Total time master / Total time pull request. Above 1.0 is an improvement. *

View result details @ https://www.ytelses.com/opm/?page=result&id=2470

ytelses avatar May 14 '24 15:05 ytelses

My local measurements turned out to be a bit deceiving, and I dont really think the effect of skipping this if statement is measureable/significant with this level of noise. Should I close the PR?

multitalentloes avatar May 15 '24 08:05 multitalentloes