valkey
Directly store score as double in the dict of zset.
In the past, we used `void *` in the dict to store the score of elements in a zset. This meant we had to dereference the score every time we accessed it, for example `*(double *)dictGetVal(de)`. This added extra dereference overhead, and for the ZUNION[STORE] command, the dict value was set twice, like this:
```c
else if (op == SET_OP_UNION) {
    ...
    dictSetDoubleVal(de, score);
    ...
    dictSetVal(dstzset->dict, de, &znode->score);
    ...
}
```
Since a double field was added to the dictEntry value union by an earlier commit, and for zsets the value stored in the dict is always of double type, we can store the score directly as a double, which improves the code's readability and removes the extra dereferencing overhead.
Codecov Report
Attention: Patch coverage is 96.00000% with 2 lines in your changes missing coverage. Please review.
Project coverage is 70.51%. Comparing base (a102852) to head (49cad34). Report is 180 commits behind head on unstable.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/module.c | 0.00% | 2 Missing :warning: |
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           unstable     #959     +/-  ##
============================================
- Coverage     70.70%   70.51%   -0.20%
============================================
  Files           114      114
  Lines         63157    63163       +6
============================================
- Hits          44653    44537     -116
- Misses        18504    18626     +122
```
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/aof.c | 80.13% <100.00%> (ø) | |
| src/db.c | 88.77% <100.00%> (ø) | |
| src/debug.c | 53.05% <100.00%> (ø) | |
| src/defrag.c | 84.91% <100.00%> (+0.56%) | :arrow_up: |
| src/geo.c | 93.61% <100.00%> (+0.01%) | :arrow_up: |
| src/rdb.c | 76.35% <100.00%> (-0.32%) | :arrow_down: |
| src/t_zset.c | 95.62% <100.00%> (+0.05%) | :arrow_up: |
| src/module.c | 9.66% <0.00%> (ø) | |
@RayaCoo Could you run some benchmark around this to check if there is any performance gain?
@hpatro If @RayaCoo doesn't respond, would you be able to run them and we can then decide to merge this?
I don't see any performance gain/drop with valkey-benchmark ZADD/ZPOPMIN benchmark suite. However, the change makes sense. @madolson maybe we merge it for maintainability.
Due to personal reasons, I wasn't able to complete the tests and reply on time; I'm sorry about that.
@madolson @hpatro please check this out.
I rebased the commit onto the latest codebase and finished testing several related commands: ZADD, ZRANK, ZSCORE, ZINTER[STORE], ZUNION[STORE], and ZDIFF[STORE].
Simple summary:
For the ZRANK command, performance improved by about 2.14%.
For the ZSCORE command, performance improved by about 1.73%.
For the ZADD command, performance improved by about 0.8%.
For ZINTER[STORE], ZUNION[STORE], and ZDIFF[STORE], this PR doesn't significantly affect performance.
Considering measurement variance, the improvements may fluctuate, but the change does enhance the performance of certain commands. For code readability, I think storing the score directly in the double field is more intuitive.
Test Method and Result
ZADD
test method
taskset -c 0-3 ./src/valkey-server --save ""
taskset -c 4-6 ./src/valkey-benchmark -P 10 -n 10000000 -t zadd
result
- unstable
Summary:
throughput summary: 556080.75 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.828 0.224 0.799 1.079 1.111 2.639
- This PR
Summary:
throughput summary: 560506.69 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.827 0.232 0.839 1.055 1.087 2.311
ZSCORE
test method
taskset -c 0-3 ./src/valkey-server --port 12345 --save ""
python ./addDataToZset.py -k zset1 -p 12345 -min 0 -max 200
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 50000000 zscore zset1 member_100
P.S. The addDataToZset script adds elements in the format {member_n: n} to a given key's zset, where n is a value between the specified min and max.
result
- unstable
Summary:
throughput summary: 726934.38 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.624 0.160 0.599 0.815 0.839 2.175
- This PR
Summary:
throughput summary: 742511.81 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.610 0.160 0.583 0.799 0.823 5.087
ZRANK
test method
taskset -c 0-3 ./src/valkey-server --port 12345 --save ""
python ./addDataToZset.py -k zset1 -p 12345 -min 0 -max 200
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 50000000 zrank zset1 member_100 withscore
result
- unstable
Summary:
throughput summary: 614220.44 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.719 0.232 0.687 0.951 0.991 2.511
- This PR
Summary:
throughput summary: 624875.06 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.704 0.232 0.671 0.935 0.967 4.439
ZINTER[STORE]
test method
taskset -c 0-3 ./src/valkey-server --port 12345 --save ""
python ./addDataToZset.py -p 12345 -k zset1 -min 0 -max 200
python ./addDataToZset.py -p 12345 -k zset2 -min 0 -max 400
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 1000000 zinter 2 zset1 zset2
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 1000000 zinterstore zset3 2 zset1 zset2
result
- unstable
  - ZINTER
    Summary:
      throughput summary: 12148.45 requests per second
      latency summary (msec):
        avg min p50 p95 p99 max
        35.366 1.000 35.519 39.359 41.151 57.119
  - ZINTERSTORE
    Summary:
      throughput summary: 13543.71 requests per second
      latency summary (msec):
        avg min p50 p95 p99 max
        36.833 0.880 40.703 43.263 43.679 52.351
- This PR
  - ZINTER
    Summary:
      throughput summary: 12129.30 requests per second
      latency summary (msec):
        avg min p50 p95 p99 max
        36.080 0.992 36.799 38.975 41.119 59.775
  - ZINTERSTORE
    Summary:
      throughput summary: 13600.08 requests per second
      latency summary (msec):
        avg min p50 p95 p99 max
        36.680 0.944 40.575 43.103 43.679 53.503
ZUNION[STORE]
test method
taskset -c 0-3 ./src/valkey-server --port 12345 --save ""
python ./addDataToZset.py -p 12345 -k zset1 -min 0 -max 200
python ./addDataToZset.py -p 12345 -k zset2 -min 0 -max 400
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 500000 zunion 2 zset1 zset2
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 500000 zunionstore zset3 2 zset1 zset2
result
- unstable
  - ZUNION
    Summary:
      throughput summary: 5989.53 requests per second
      latency summary (msec):
        avg min p50 p95 p99 max
        70.335 1.688 70.655 74.175 79.871 99.647
  - ZUNIONSTORE
    Summary:
      throughput summary: 6810.41 requests per second
      latency summary (msec):
        avg min p50 p95 p99 max
        73.302 1.592 79.359 80.895 82.111 98.303
- This PR
  - ZUNION
    Summary:
      throughput summary: 5949.55 requests per second
      latency summary (msec):
        avg min p50 p95 p99 max
        70.517 1.712 70.399 75.583 78.143 99.007
  - ZUNIONSTORE
    Summary:
      throughput summary: 6735.55 requests per second
      latency summary (msec):
        avg min p50 p95 p99 max
        74.116 1.600 80.191 81.791 82.943 102.911
ZDIFF[STORE]
test method
taskset -c 0-3 ./src/valkey-server --port 12345 --save ""
python ./addDataToZset.py -p 12345 -k zset1 -min 0 -max 200
python ./addDataToZset.py -p 12345 -k zset2 -min 0 -max 400
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 5000000 zdiff 2 zset1 zset2
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 5000000 zdiffstore zset3 2 zset1 zset2
result
- unstable
  - ZDIFF
    Summary:
      throughput summary: 83794.20 requests per second
      latency summary (msec):
        avg min p50 p95 p99 max
        5.904 0.832 6.911 7.223 7.383 10.343
  - ZDIFFSTORE
    Summary:
      throughput summary: 84455.18 requests per second
      latency summary (msec):
        avg min p50 p95 p99 max
        5.852 0.848 6.759 7.015 7.135 11.503
- This PR
  - ZDIFF
    Summary:
      throughput summary: 85446.72 requests per second
      latency summary (msec):
        avg min p50 p95 p99 max
        5.789 0.832 6.767 7.039 7.151 12.743
  - ZDIFFSTORE
    Summary:
      throughput summary: 84556.59 requests per second
      latency summary (msec):
        avg min p50 p95 p99 max
        5.848 0.896 6.783 7.055 7.167 11.103
I suppose a downside of this is now there is more room for error.
Could you explain this a bit more?
Sorry. I meant that we are now independently maintaining the double value in two places, as opposed to a pointer to a single value. There is more risk of them diverging: if the pointer is wrong it'll probably crash, but if the double is wrong it'll just be a silent divergence.
I have simply changed how we modify and retrieve the score in the zset to use the double value directly. The modifications are always made in pairs (first updating the skiplist, then the dict). Additionally, our current tests have not revealed any issues with this approach. I believe the previous zset implementation was robust, as no crashes occurred due to pointer issues, so changing the update method should not increase the risk of errors.
Please don't merge this.
I forgot to comment about it earlier, but I plan to replace the dict with another hash table and use the skiplist node as the hashtable entry. #1096
Merging this would also make #1389 impossible.
Closing this PR as the underlying structure has been changed. Nevertheless, thanks for your work, @RayaCoo.