opm-common
opm-common copied to clipboard
Add specialized binary search skipping boundary check
Small adjustment that skips checking if the indices are within the range because this is already checked in the outer function. This function is very often called, so might give a small improvement.
std::upper_bound() seems to be slower on my machine benchmarking a couple of runs on norne... Edit: To be a bit more precise, the property evaluation on single core norne on my machine took 75 sec with the original code, 72sec with this small change, and 77 sec using std::upper_bound to binary search and std::distance to get the index
benchmark please
The measurements are probably within what can be explained by noise, so hopefully then benchmark can provide some consistent results
Benchmark result overview:
Test | Configuration | Relative |
---|---|---|
opm-git | OPM Benchmark: drogon - Threads: 1 - FOPT (Total Oil Production At End Of Run) | 1 |
opm-git | OPM Benchmark: drogon - Threads: 8 - FOPT (Total Oil Production At End Of Run) | 1 |
opm-git | OPM Benchmark: punqs3 - Threads: 1 - FOPT (Total Oil Production At End Of Run) | 1 |
opm-git | OPM Benchmark: punqs3 - Threads: 8 - FOPT (Total Oil Production At End Of Run) | 1 |
opm-git | OPM Benchmark: smeaheia - Threads: 1 - FOPT (Total Oil Production At End Of Run) | 1 |
opm-git | OPM Benchmark: smeaheia - Threads: 8 - FOPT (Total Oil Production At End Of Run) | 1 |
opm-git | OPM Benchmark: spe10_model_1 - Threads: 1 - FOPT (Total Oil Production At End Of Run) | 1 |
opm-git | OPM Benchmark: spe10_model_1 - Threads: 8 - FOPT (Total Oil Production At End Of Run) | 1 |
opm-git | OPM Benchmark: flow_mpi_extra - Threads: 1 - FOIT (Total Oil Injection At End Of Run) | 1 |
opm-git | OPM Benchmark: flow_mpi_extra - Threads: 8 - FOIT (Total Oil Injection At End Of Run) | 1 |
opm-git | OPM Benchmark: flow_mpi_norne - Threads: 1 - FOPT (Total Oil Production At End Of Run) | 1 |
opm-git | OPM Benchmark: flow_mpi_norne - Threads: 8 - FOPT (Total Oil Production At End Of Run) | 1 |
opm-git | OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 1 - FOPT (Total Oil Production At End Of Run) | 1 |
opm-git | OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 8 - FOPT (Total Oil Production At End Of Run) | 1 |
- Speed-up = Total time master / Total time pull request. Above 1.0 is an improvement. *
View result details @ https://www.ytelses.com/opm/?page=result&id=2448
Benchmark result overview:
FOPT (Total Oil Production At End Of Run) = 1
That's good to know, but alas not particularly useful from a CPU performance point of view. @blattms: Is anything up with the benchmark test rig at the moment?
That was a bit underwhelming, but how are these times measured since there is absolutely no deviation on any case?
notice that those are not even times. benchmarks either didn't execute properly, or it only reported fluid statistics back.
but how are these times measured since there is absolutely no deviation on any case?
notice that those are not even times.
@akva2 is correct–the FOPT
measure isn't a time at all. Rather it's a (crude) measure of solution accuracy. If FOPT
differs from one here, then we didn't compute the same solution as the reference case and it in some sense doesn't really matter what performance improvement we get from the PR under test.
Can we get some clarifications as of what happened to the performance benchmark @blattms
benchmark please
Benchmark result overview:
Test | Configuration | Relative |
---|---|---|
opm-git | OPM Benchmark: drogon - Threads: 1 | 0.997 |
opm-git | OPM Benchmark: drogon - Threads: 8 | 0.993 |
opm-git | OPM Benchmark: punqs3 - Threads: 1 | 0.978 |
opm-git | OPM Benchmark: punqs3 - Threads: 8 | 1.009 |
opm-git | OPM Benchmark: smeaheia - Threads: 1 | 0.973 |
opm-git | OPM Benchmark: smeaheia - Threads: 8 | 1 |
opm-git | OPM Benchmark: spe10_model_1 - Threads: 1 | 1.019 |
opm-git | OPM Benchmark: spe10_model_1 - Threads: 8 | 0.997 |
opm-git | OPM Benchmark: flow_mpi_extra - Threads: 1 | 0.989 |
opm-git | OPM Benchmark: flow_mpi_extra - Threads: 8 | 0.985 |
opm-git | OPM Benchmark: flow_mpi_norne - Threads: 1 | 0.984 |
opm-git | OPM Benchmark: flow_mpi_norne - Threads: 8 | 0.995 |
opm-git | OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 1 | 1.008 |
opm-git | OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 8 | 1.001 |
- Speed-up = Total time master / Total time pull request. Above 1.0 is an improvement. *
View result details @ https://www.ytelses.com/opm/?page=result&id=2470
My local measurements turned out to be a bit deceiving, and I dont really think the effect of skipping this if statement is measureable/significant with this level of noise. Should I close the PR?