pisa
pisa copied to clipboard
Move codec implementations to cpp and make codec dependencies private
Because the implementation of block-wise decode (and encode but this one is not as crucial) is moved out of the header, there is a potential that this will affect performance.
Personally, I don't think these are being inlined anyway but should be tested. I additionally added PISA_ENABLE_IPO cmake option in case we want to test link time optimization.
Codecov Report
Merging #465 (f5945bb) into master (a28100d) will decrease coverage by
0.07%. The diff coverage is50.42%.
@@ Coverage Diff @@
## master #465 +/- ##
==========================================
- Coverage 93.26% 93.18% -0.08%
==========================================
Files 92 86 -6
Lines 4926 4785 -141
==========================================
- Hits 4594 4459 -135
+ Misses 332 326 -6
| Impacted Files | Coverage Δ | |
|---|---|---|
| include/pisa/codec/block_codecs.hpp | 86.61% <ø> (-1.78%) |
:arrow_down: |
| include/pisa/codec/varintgb.hpp | 62.79% <ø> (ø) |
|
| include/pisa/forward_index.hpp | 95.12% <ø> (ø) |
|
| include/pisa/codec/VarIntG8IU.h | 50.84% <50.42%> (+0.84%) |
:arrow_up: |
| include/pisa/bit_vector.hpp | 97.54% <0.00%> (-0.41%) |
:arrow_down: |
| include/pisa/block_posting_list.hpp | 99.21% <0.00%> (+0.01%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update a28100d...f5945bb. Read the comment docs.
@JMMackenzie do you think you'd be able to run a benchmark comparing this to master? I think one block codec should be enough to verify if there's any regression, unless by looking at the code you can see more potential regressions...
CW09B with block_simdbp encoding. About 5000 queries of varying lengths.Based on the numbers below, it looks like we are paying a little for this.
AND k = 1000
Master: {"type": "block_simdbp", "query": "and", "avg": 10647.6, "q50": 3791, "q90": 28323, "q95": 43043, "q99": 93122}
This: {"type": "block_simdbp", "query": "and", "avg": 10747.1, "q50": 3743, "q90": 28466, "q95": 43279, "q99": 94556}
MaxScore k = 10
Master: {"type": "block_simdbp", "query": "maxscore", "avg": 19966.6, "q50": 10714, "q90": 51073, "q95": 69106, "q99": 115939}
This: {"type": "block_simdbp", "query": "maxscore", "avg": 24766.2, "q50": 14654, "q90": 62199, "q95": 81999, "q99": 126860}
MaxScore k = 1000
Master: {"type": "block_simdbp", "query": "maxscore", "avg": 35093.4, "q50": 24444, "q90": 79897, "q95": 104968, "q99": 170087}
This: {"type": "block_simdbp", "query": "maxscore", "avg": 39695.4, "q50": 29151, "q90": 88562, "q95": 114082, "q99": 176220}
Thanks @JMMackenzie can you try passing -DPISA_ENABLE_IPO to cmake, recompile and run again? I'm curious if this helps.
Actually, I see a bug there, I name the option with PISA_ but the check is without it (right below). You'd need to fix that. Whatever you do, please check if the lto flag is actually passed to the linker to make sure we compare the right thing.
Actually, I see a bug there, I name the option with PISA_ but the check is without it (right below). You'd need to fix that. Whatever you do, please check if the lto flag is actually passed to the linker to make sure we compare the right thing.
Already saw and fixed that. Having some trouble with CMake now, working through it. Will report back.
You'll also need:
cmake_policy(SET CMP0069 NEW) # Use IPO (LTO) when requested
Doesn't seem to help too much:
AND k = 1000
Master: {"type": "block_simdbp", "query": "and", "avg": 10647.6, "q50": 3791, "q90": 28323, "q95": 43043, "q99": 93122}
This: {"type": "block_simdbp", "query": "and", "avg": 10747.1, "q50": 3743, "q90": 28466, "q95": 43279, "q99": 94556}
IPO: {"type": "block_simdbp", "query": "and", "avg": 10673.2, "q50": 3786, "q90": 28325, "q95": 42724, "q99": 93325}
MaxScore k = 10
Master: {"type": "block_simdbp", "query": "maxscore", "avg": 19966.6, "q50": 10714, "q90": 51073, "q95": 69106, "q99": 115939}
This: {"type": "block_simdbp", "query": "maxscore", "avg": 24766.2, "q50": 14654, "q90": 62199, "q95": 81999, "q99": 126860}
IPO: {"type": "block_simdbp", "query": "maxscore", "avg": 24060.2, "q50": 14279, "q90": 60437, "q95": 79765, "q99": 123746}
MaxScore k = 1000
Master: {"type": "block_simdbp", "query": "maxscore", "avg": 35093.4, "q50": 24444, "q90": 79897, "q95": 104968, "q99": 170087}
This: {"type": "block_simdbp", "query": "maxscore", "avg": 39695.4, "q50": 29151, "q90": 88562, "q95": 114082, "q99": 176220}
IPO: {"type": "block_simdbp", "query": "maxscore", "avg": 40741.5, "q50": 30760, "q90": 89645, "q95": 114450, "q99": 173881}
Are we willing to take this performance hit? I'm happy to revisit these tests soon if we want to make sure...
Are we willing to take this performance hit? I'm happy to revisit these tests soon if we want to make sure...
Possibly... but not entirely sure. Kind of want to take some time and think about it, mostly whether there's a different way of improving compilation but without the performance hit. Hope to have some time for it sometime this week.