Get column value by column label in Framework Core ASoA
I will continue our discussion here @ktf, @jgrosseo, @saganatt.
@jgrosseo was correct; this solution shouldn't be implemented as it is. It was slower in this benchmark by 3-10 times for simple examples. I will leave this draft PR here and try to implement this alternative solution I was writing about in https://github.com/AliceO2Group/O2Physics/pull/7371. That 20% I was speaking of during the presentation was the whole task performance difference locally on my laptop. These are my local benchmark results:
------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
------------------------------------------------------------------------------------------------------------
BM_ASoADynamicColumnPresent/8 237 ns 236 ns 2983783 bytes_per_second=258.465M/s
BM_ASoADynamicColumnPresent/64 395 ns 395 ns 1717027 bytes_per_second=1.20857G/s
BM_ASoADynamicColumnPresent/512 1559 ns 1555 ns 448023 bytes_per_second=2.45328G/s
BM_ASoADynamicColumnPresent/4096 10961 ns 10933 ns 62651 bytes_per_second=2.79137G/s
BM_ASoADynamicColumnPresent/32768 85382 ns 85164 ns 8024 bytes_per_second=2.86673G/s
BM_ASoADynamicColumnPresent/262144 679514 ns 677289 ns 998 bytes_per_second=2.88374G/s
BM_ASoADynamicColumnPresentGetValueByLabel/8 419 ns 418 ns 1622625 bytes_per_second=145.876M/s
BM_ASoADynamicColumnPresentGetValueByLabel/64 1772 ns 1768 ns 392634 bytes_per_second=276.255M/s
BM_ASoADynamicColumnPresentGetValueByLabel/512 12671 ns 12641 ns 54994 bytes_per_second=309.019M/s
BM_ASoADynamicColumnPresentGetValueByLabel/4096 99301 ns 99056 ns 6893 bytes_per_second=315.478M/s
BM_ASoADynamicColumnPresentGetValueByLabel/32768 805034 ns 802857 ns 858 bytes_per_second=311.388M/s
BM_ASoADynamicColumnPresentGetValueByLabel/262144 6319410 ns 6301477 ns 107 bytes_per_second=317.386M/s
BM_ASoADynamicColumnCall/8 25.3 ns 25.2 ns 28377608 bytes_per_second=2.36521G/s
BM_ASoADynamicColumnCall/64 183 ns 182 ns 3722494 bytes_per_second=2.61903G/s
BM_ASoADynamicColumnCall/512 1378 ns 1375 ns 478496 bytes_per_second=2.7751G/s
BM_ASoADynamicColumnCall/4096 11039 ns 11012 ns 62893 bytes_per_second=2.77126G/s
BM_ASoADynamicColumnCall/32768 88298 ns 88067 ns 7847 bytes_per_second=2.77221G/s
BM_ASoADynamicColumnCall/262144 706708 ns 704322 ns 996 bytes_pelementaryer_second=2.77306G/s
BM_ASoADynamicColumnCallGetValueByLabel/8 67.6 ns 67.4 ns 10002511 bytes_per_second=905.098M/s
BM_ASoADynamicColumnCallGetValueByLabel/64 500 ns 498 ns 1345915 bytes_per_second=979.614M/s
BM_ASoADynamicColumnCallGetValueByLabel/512 3981 ns 3972 ns 175994 bytes_per_second=983.554M/s
BM_ASoADynamicColumnCallGetValueByLabel/4096 31831 ns 31757 ns 22014 bytes_per_second=984.025M/s
BM_ASoADynamicColumnCallGetValueByLabel/32768 255368 ns 254706 ns 2722 bytes_per_second=981.526M/s
BM_ASoADynamicColumnCallGetValueByLabel/262144 2029059 ns 2022533 ns 340 bytes_per_second=988.859M/s
REQUEST FOR PRODUCTION RELEASES:
To request your PR to be included in production software, please add the corresponding labels called "async-
+async-label <label1>, <label2>, !<label3> ...
This will add <label1> and <label2> and removes <label3>.
The following labels are available async-2023-pbpb-apass3 async-2023-pbpb-apass4 async-2023-pp-apass4 async-2024-pp-apass1 async-2022-pp-apass7 async-2024-pp-cpass0
Thank you for your investigation. Much appreciated.
As a general note, please try to have a smaller contributions if possible, because that will for sure simplify and speedup the review process.
Previously, I used std::function as the return type for the getter, but the performance was pretty bad. I changed O2 to debug mode and used perf + speedscope to analyze it. I obtained a much better performance, almost equal to an explicit getter call (less than 10% execution time difference). Then I switched back to release mode for compilation, and these are the results:
-------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
-------------------------------------------------------------------------------------------------------------
BM_ASoADynamicColumnPresent/8 236 ns 235 ns 2951784 bytes_per_second=259.24M/s
BM_ASoADynamicColumnPresent/64 397 ns 396 ns 1765157 bytes_per_second=1.2049G/s
BM_ASoADynamicColumnPresent/512 1557 ns 1553 ns 454082 bytes_per_second=2.45591G/s
BM_ASoADynamicColumnPresent/4096 10783 ns 10758 ns 63512 bytes_per_second=2.83666G/s
BM_ASoADynamicColumnPresent/32768 85087 ns 84863 ns 8263 bytes_per_second=2.87686G/s
BM_ASoADynamicColumnPresent/262144 683696 ns 681478 ns 1030 bytes_per_second=2.86601G/s
BM_ASoADynamicColumnPresentGetGetterByLabel/8 277 ns 276 ns 2529198 bytes_per_second=221.199M/s
BM_ASoADynamicColumnPresentGetGetterByLabel/64 658 ns 656 ns 1077150 bytes_per_second=744.385M/s
BM_ASoADynamicColumnPresentGetGetterByLabel/512 3490 ns 3482 ns 199110 bytes_per_second=1121.89M/s
BM_ASoADynamicColumnPresentGetGetterByLabel/4096 27125 ns 27054 ns 26808 bytes_per_second=1.12801G/s
BM_ASoADynamicColumnPresentGetGetterByLabel/32768 208795 ns 208255 ns 3366 bytes_per_second=1.17232G/s
BM_ASoADynamicColumnPresentGetGetterByLabel/262144 1668853 ns 1663740 ns 423 bytes_per_second=1.17394G/s
BM_ASoADynamicColumnCall/8 24.8 ns 24.7 ns 28070062 bytes_per_second=2.41155G/s
BM_ASoADynamicColumnCall/64 180 ns 180 ns 3630937 bytes_per_second=2.65376G/s
BM_ASoADynamicColumnCall/512 1377 ns 1374 ns 497025 bytes_per_second=2.77621G/s
BM_ASoADynamicColumnCall/4096 10961 ns 10935 ns 63683 bytes_per_second=2.79084G/s
BM_ASoADynamicColumnCall/32768 87853 ns 87642 ns 8035 bytes_per_second=2.78565G/s
BM_ASoADynamicColumnCall/262144 703888 ns 701646 ns 982 bytes_per_second=2.78363G/s
BM_ASoADynamicColumnCallGetGetterByLabel/8 40.0 ns 39.9 ns 17621092 bytes_per_second=1.4949G/s
BM_ASoADynamicColumnCallGetGetterByLabel/64 299 ns 298 ns 2335159 bytes_per_second=1.60043G/s
BM_ASoADynamicColumnCallGetGetterByLabel/512 2400 ns 2394 ns 300728 bytes_per_second=1.59373G/s
BM_ASoADynamicColumnCallGetGetterByLabel/4096 18681 ns 18635 ns 36845 bytes_per_second=1.63765G/s
BM_ASoADynamicColumnCallGetGetterByLabel/32768 148504 ns 148136 ns 4728 bytes_per_second=1.64809G/s
BM_ASoADynamicColumnCallGetGetterByLabel/262144 1189493 ns 1185832 ns 571 bytes_per_second=1.64705G/s
I am currently stuck with this solution; I would like to know if using the get() explicitly column struct member method is possible. I was trying to implement such an approach, but there is a type problem; the member method pointer type is ex. float(*ColumnType)(const RowItType&), and I would know ColumnType only in runtime after string comparison, so I can't think of an efficient implementation.
I would like to know if you think such a performance difference would be acceptable and if you can think of a more efficient solution.
@ktf @jgrosseo, May you have a look at the latest changes?
@mytkom thank you for your work. See inline comments. I might have more once those are resolved.
@ktf I have finally made the changes requested in your last review. Could you take a look at the current implementation?
@ktf
I simplified this using nullptr as you advised; now I wonder if this strncmp in my implementation is not making it overcomplicated for such a comparison. Could you take a look and give me your opinion on this? Maybe using string + strcmp for str.data() instead of string_view would be more readable and performance drawback should not be so heavy.
Error while checking build/O2/fullCI for 6e55f8da6a8592a4f7cf9af6cb10c5a5b86de16f at 2024-10-21 18:16:
## sw/BUILD/xjalienfs-latest/log
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
## sw/BUILD/O2-sim-challenge-test-latest/log
./digi.log[ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/37}
./digi.log[ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/38}
./digi.log[ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/40}
./digi.log[ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/42}
./digi.log[ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/43}
./digi.log[ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/45}
./digi.log[ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/46}
./digi.log[ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/47}
./digi.log[ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/48}
./digi.log[ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/49}
./digi.log[9304:internal-dpl-ccdb-backend]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/37}
./digi.log[9304:internal-dpl-ccdb-backend]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/38}
./digi.log[9304:internal-dpl-ccdb-backend]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/40}
./digi.log[9304:internal-dpl-ccdb-backend]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/42}
./digi.log[9304:internal-dpl-ccdb-backend]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/43}
./digi.log[9304:internal-dpl-ccdb-backend]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/45}
./digi.log[9304:internal-dpl-ccdb-backend]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/46}
./digi.log[9304:internal-dpl-ccdb-backend]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/47}
./digi.log[9304:internal-dpl-ccdb-backend]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/48}
./digi.log[9304:internal-dpl-ccdb-backend]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/49}
./digi.log[9316:TPCDigitizer_7]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/37}
./digi.log[9316:TPCDigitizer_7]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/38}
./digi.log[9316:TPCDigitizer_7]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/40}
./digi.log[9316:TPCDigitizer_7]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/42}
./digi.log[9316:TPCDigitizer_7]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/43}
./digi.log[9316:TPCDigitizer_7]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/45}
./digi.log[9316:TPCDigitizer_7]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/46}
./digi.log[9316:TPCDigitizer_7]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/47}
./digi.log[9316:TPCDigitizer_7]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/48}
./digi.log[9316:TPCDigitizer_7]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/49}
./digi.log[9302:internal-dpl-clock]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/37}
./digi.log[9302:internal-dpl-clock]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/38}
./digi.log[9302:internal-dpl-clock]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/40}
./digi.log[9302:internal-dpl-clock]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/42}
./digi.log[9302:internal-dpl-clock]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/43}
./digi.log[9302:internal-dpl-clock]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/45}
./digi.log[9302:internal-dpl-clock]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/46}
./digi.log[9302:internal-dpl-clock]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/47}
./digi.log[9302:internal-dpl-clock]: [ERROR] Found duplicate input binding with different spec.:collisioncontext {SIM/COLLISIONCONTEXT/48}
[0 more errors; see full log]
Full log here.
Good morning @ktf, Could you review the changes from 2 weeks ago and answer my comment? I will be grateful!
@ktf Would you have time to take a look at this PR once more?
The string_view API has an efficient starts_with() method, which would check exactly the case I want to handle and it is very readable. Then I think that creating string_view from C::columnLabel() is more performant, because strlen() is being executed once, only on creation. Without it, it is executed on every comparison, since in the worst case, there are 2 comparisons, it should increase worst-case performance. What do you think of this:
template <typename R, typename T, persistent_with_common_getter<R> C>
ColumnGetterFunction<R, T> createGetterPtr(const std::string_view& targetColumnLabel)
{
return targetColumnLabel == C::columnLabel() ? &getColumnValue<R, T, C> : nullptr;
}
template <typename R, typename T, dynamic_with_common_getter<R> C>
ColumnGetterFunction<R, T> createGetterPtr(const std::string_view& targetColumnLabel)
{
std::string_view columnLabel(C::columnLabel());
// allows user to use consistent formatting (with prefix) of all column labels
// by default there isn't 'f' prefix for dynamic column labels
if (targetColumnLabel.starts_with("f") && targetColumnLabel.substr(1) == columnLabel) {
return &getColumnValue<R, T, C>;
}
// check also exact match if user is aware of prefix missing
if (targetColumnLabel == columnLabel) {
return &getColumnValue<R, T, C>;
}
return nullptr;
}
Error while checking build/O2/fullCI for 962a8d2e9d6cd6cdffe78face3b4808664dc1e9e at 2024-12-10 06:29:
## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:38: error: 'is_dynamic_v' was not declared in this scope; did you mean 'is_dynamic_t'?
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:52: error: expected primary-expression before '>' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:38: error: label 'framework' referenced outside of any function
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:83: error: expected ';' before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:139: error: expected unqualified-id before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:38: error: 'is_dynamic_v' was not declared in this scope; did you mean 'is_dynamic_t'?
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:52: error: expected primary-expression before '>' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:38: error: label 'framework' referenced outside of any function
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:83: error: expected ';' before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:139: error: expected unqualified-id before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:38: error: 'is_dynamic_v' was not declared in this scope; did you mean 'is_dynamic_t'?
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:52: error: expected primary-expression before '>' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:38: error: label 'framework' referenced outside of any function
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:83: error: expected ';' before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:139: error: expected unqualified-id before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:38: error: 'is_dynamic_v' was not declared in this scope; did you mean 'is_dynamic_t'?
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:52: error: expected primary-expression before '>' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:38: error: label 'framework' referenced outside of any function
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:83: error: expected ';' before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:139: error: expected unqualified-id before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:38: error: 'is_dynamic_v' was not declared in this scope; did you mean 'is_dynamic_t'?
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:52: error: expected primary-expression before '>' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:38: error: label 'framework' referenced outside of any function
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:83: error: expected ';' before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:139: error: expected unqualified-id before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:38: error: 'is_dynamic_v' was not declared in this scope; did you mean 'is_dynamic_t'?
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:52: error: expected primary-expression before '>' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:38: error: label 'framework' referenced outside of any function
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:83: error: expected ';' before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:139: error: expected unqualified-id before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:38: error: 'is_dynamic_v' was not declared in this scope; did you mean 'is_dynamic_t'?
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2188:52: error: expected primary-expression before '>' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:38: error: label 'framework' referenced outside of any function
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:83: error: expected ';' before ')' token
/sw/SOURCES/O2/13498-slc8_x86-64/0/Framework/Core/include/Framework/ASoA.h:2190:139: error: expected unqualified-id before ')' token
ninja: build stopped: subcommand failed.
Full log here.
Error while checking build/O2/fullCI_slc9 for 962a8d2e9d6cd6cdffe78face3b4808664dc1e9e at 2024-12-04 12:06:
## sw/BUILD/O2-latest/log
ninja: build stopped: subcommand failed.
Full log here.
@ktf I have rebased PR, and now all checks are successful.
Thank you for following this through. Merging this. Could you please move the helper functions which are only internal into an anonymous namespace?