
Directly store score as double in the dict of zset.

RayaCoo opened this issue 1 year ago • 3 comments

In the past, we used a void * in the dict to store the score of elements in a zset. This meant we had to dereference the score every time we accessed it, for example *(double *)dictGetVal(de). This added extra dereference overhead, and for the ZUNION[STORE] command the dict value was set twice, like this:

else if (op == SET_OP_UNION) {
    ...
    dictSetDoubleVal(de, score);
    ...
    dictSetVal(dstzset->dict, de, &znode->score);
    ...
}

Since a double field was added to the dictEntry value union by an earlier commit, and for zset the value stored in the dict is always a double, we can store the score directly as a double, which improves the code's readability and removes the extra dereferencing overhead.

RayaCoo avatar Aug 28 '24 11:08 RayaCoo

Codecov Report

Attention: Patch coverage is 96.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 70.51%. Comparing base (a102852) to head (49cad34). Report is 180 commits behind head on unstable.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/module.c | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #959      +/-   ##
============================================
- Coverage     70.70%   70.51%   -0.20%     
============================================
  Files           114      114              
  Lines         63157    63163       +6     
============================================
- Hits          44653    44537     -116     
- Misses        18504    18626     +122     
| Files with missing lines | Coverage Δ |
|---|---|
| src/aof.c | 80.13% <100.00%> (ø) |
| src/db.c | 88.77% <100.00%> (ø) |
| src/debug.c | 53.05% <100.00%> (ø) |
| src/defrag.c | 84.91% <100.00%> (+0.56%) ↑ |
| src/geo.c | 93.61% <100.00%> (+0.01%) ↑ |
| src/rdb.c | 76.35% <100.00%> (-0.32%) ↓ |
| src/t_zset.c | 95.62% <100.00%> (+0.05%) ↑ |
| src/module.c | 9.66% <0.00%> (ø) |

... and 11 files with indirect coverage changes

codecov[bot] avatar Aug 28 '24 12:08 codecov[bot]

@RayaCoo Could you run some benchmark around this to check if there is any performance gain?

hpatro avatar Sep 04 '24 14:09 hpatro

@hpatro If @RayaCoo doesn't respond, would you be able to run them and we can then decide to merge this?

madolson avatar Oct 01 '24 17:10 madolson

I don't see any performance gain/drop with the valkey-benchmark ZADD/ZPOPMIN benchmark suite. However, the change makes sense. @madolson maybe we merge it for maintainability.

hpatro avatar Oct 29 '24 20:10 hpatro

Due to personal reasons, I wasn't able to complete the tests and reply on time, and I'm sorry about that. @madolson @hpatro please check this out. I rebased the commit onto the latest codebase and finished testing several related commands: ZADD, ZRANK, ZSCORE, ZINTER[STORE], ZUNION[STORE], and ZDIFF[STORE].

Simple summary:

For the ZRANK command, performance improved by about 2.14%. For the ZSCORE command, by about 1.73%. For the ZADD command, by about 0.8%. For ZINTER[STORE], ZUNION[STORE], and ZDIFF[STORE], this PR doesn't significantly affect performance.

Considering measurement variance, these improvements may fluctuate, but the change does improve the performance of certain commands. As for code readability, I think directly using the double field to store the score is more intuitive.

Test Method and Result

ZADD

test method

taskset -c 0-3 ./src/valkey-server --save ""
taskset -c 4-6 ./src/valkey-benchmark -P 10 -n 10000000 -t zadd

result

  • unstable
Summary:
  throughput summary: 556080.75 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.828     0.224     0.799     1.079     1.111     2.639
  • This PR
Summary:
  throughput summary: 560506.69 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.827     0.232     0.839     1.055     1.087     2.311

ZSCORE

test method

taskset -c 0-3 ./src/valkey-server --port 12345 --save ""
python ./addDataToZset.py -k zset1 -p 12345 -min 0 -max 200
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 50000000 zscore zset1 member_100

P.S. The addDataToZset script adds elements of the form {member_n: n} to a given key's zset, where n ranges over the specified min and max.
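The script itself was not shared; a hedged sketch based on the description and the command lines above (argument names follow those invocations, and the client library is an assumption) might look like this:

```python
#!/usr/bin/env python3
"""Sketch of the addDataToZset helper described above: fills a zset at
KEY with members "member_n" scored n for n in [min, max]. Illustrative
only; the actual script used for the benchmarks was not published."""

import argparse
import sys


def gen_members(lo, hi):
    """Return the {member_n: n} mapping for n in [lo, hi]."""
    return {"member_%d" % n: float(n) for n in range(lo, hi + 1)}


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("-k", required=True, dest="key")
    parser.add_argument("-p", type=int, default=6379, dest="port")
    parser.add_argument("-min", type=int, default=0)
    parser.add_argument("-max", type=int, default=200)
    args = parser.parse_args()

    # Assumes the redis-py client (protocol-compatible with Valkey)
    # and a server already running on the given port.
    import redis

    r = redis.Redis(port=args.port)
    r.zadd(args.key, gen_members(args.min, args.max))


# Only run when invoked with arguments, so the module can be
# imported without requiring a live server.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```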

result

  • unstable
Summary:
  throughput summary: 726934.38 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.624     0.160     0.599     0.815     0.839     2.175
  • This PR
Summary:
  throughput summary: 742511.81 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.610     0.160     0.583     0.799     0.823     5.087

ZRANK

test method

taskset -c 0-3 ./src/valkey-server --port 12345 --save ""
python ./addDataToZset.py -k zset1 -p 12345 -min 0 -max 200
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 50000000 zrank zset1 member_100 withscore

result

  • unstable
Summary:
  throughput summary: 614220.44 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.719     0.232     0.687     0.951     0.991     2.511

  • This PR

Summary:
  throughput summary: 624875.06 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.704     0.232     0.671     0.935     0.967     4.439

ZINTER[STORE]

test method

taskset -c 0-3 ./src/valkey-server --port 12345 --save ""
python ./addDataToZset.py -p 12345 -k zset1 -min 0 -max 200
python ./addDataToZset.py -p 12345 -k zset2 -min 0 -max 400

taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 1000000 zinter 2 zset1 zset2
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 1000000 zinterstore zset3 2 zset1 zset2

result

  • unstable

    • ZINTER
    Summary:
      throughput summary: 12148.45 requests per second
      latency summary (msec):
              avg       min       p50       p95       p99       max
           35.366     1.000    35.519    39.359    41.151    57.119
    
    • ZINTERSTORE
    Summary:
      throughput summary: 13543.71 requests per second
      latency summary (msec):
              avg       min       p50       p95       p99       max
           36.833     0.880    40.703    43.263    43.679    52.351
    
  • This PR

    • ZINTER
    Summary:
      throughput summary: 12129.30 requests per second
      latency summary (msec):
              avg       min       p50       p95       p99       max
           36.080     0.992    36.799    38.975    41.119    59.775
    
    • ZINTERSTORE
    Summary:
      throughput summary: 13600.08 requests per second
      latency summary (msec):
              avg       min       p50       p95       p99       max
           36.680     0.944    40.575    43.103    43.679    53.503
    

ZUNION[STORE]

test method

taskset -c 0-3 ./src/valkey-server --port 12345 --save ""
python ./addDataToZset.py -p 12345 -k zset1 -min 0 -max 200
python ./addDataToZset.py -p 12345 -k zset2 -min 0 -max 400

taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 500000 zunion 2 zset1 zset2
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 500000 zunionstore zset3 2 zset1 zset2

result

  • unstable
    • ZUNION
    Summary:
      throughput summary: 5989.53 requests per second
      latency summary (msec):
              avg       min       p50       p95       p99       max
           70.335     1.688    70.655    74.175    79.871    99.647
    
    • ZUNIONSTORE
    Summary:
      throughput summary: 6810.41 requests per second
      latency summary (msec):
              avg       min       p50       p95       p99       max
           73.302     1.592    79.359    80.895    82.111    98.303
    
  • This PR
    • ZUNION
    Summary:
      throughput summary: 5949.55 requests per second
      latency summary (msec):
              avg       min       p50       p95       p99       max
           70.517     1.712    70.399    75.583    78.143    99.007
    
    • ZUNIONSTORE
    Summary:
      throughput summary: 6735.55 requests per second
      latency summary (msec):
              avg       min       p50       p95       p99       max
           74.116     1.600    80.191    81.791    82.943   102.911
    

ZDIFF[STORE]

test method

taskset -c 0-3 ./src/valkey-server --port 12345 --save ""
python ./addDataToZset.py -p 12345 -k zset1 -min 0 -max 200
python ./addDataToZset.py -p 12345 -k zset2 -min 0 -max 400

taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 5000000 zdiff 2 zset1 zset2
taskset -c 4-6 ./src/valkey-benchmark -p 12345 -P 10 -n 5000000 zdiffstore zset3 2 zset1 zset2

result

  • unstable
    • ZDIFF
    Summary:
      throughput summary: 83794.20 requests per second
      latency summary (msec):
              avg       min       p50       p95       p99       max
            5.904     0.832     6.911     7.223     7.383    10.343
    
    • ZDIFFSTORE
    Summary:
      throughput summary: 84455.18 requests per second
      latency summary (msec):
              avg       min       p50       p95       p99       max
            5.852     0.848     6.759     7.015     7.135    11.503
    
  • This PR
    • ZDIFF
    Summary:
      throughput summary: 85446.72 requests per second
      latency summary (msec):
              avg       min       p50       p95       p99       max
            5.789     0.832     6.767     7.039     7.151    12.743
    
    • ZDIFFSTORE
    Summary:
      throughput summary: 84556.59 requests per second
      latency summary (msec):
              avg       min       p50       p95       p99       max
            5.848     0.896     6.783     7.055     7.167    11.103
    

RayaCoo avatar Oct 31 '24 12:10 RayaCoo

I suppose a downside of this is now there is more room for error.

Could you explain this a bit more?

hpatro avatar Nov 04 '24 19:11 hpatro

Could you explain this a bit more?

Sorry. I meant that we are now independently maintaining the double value in two places, as opposed to a pointer to a single value. There is more risk of them diverging. If the pointer is wrong it'll probably crash; if the double is wrong it'll just be a silent divergence.

madolson avatar Nov 05 '24 19:11 madolson

Sorry. I meant that we are now independently maintaining the double value in two places, as opposed to a pointer to a single value. There is more risk of them diverging. If the pointer is wrong it'll probably crash; if the double is wrong it'll just be a silent divergence.

I have simply changed the way we modify and retrieve the score in zset to use the double value directly. The modifications are always made in pairs (first updating the skiplist, then the dict). Additionally, our current tests have not revealed any issues with this approach. I believe the previous zset implementation was robust, as no crashes occurred due to pointer issues. Thus, modifying the update method should not increase the risk of errors.

RayaCoo avatar Nov 07 '24 05:11 RayaCoo

Please don't merge this.

I forgot to comment about it earlier, but I plan to replace the dict with another hash table and use the skiplist node as the hashtable entry. #1096

Merging this would also make #1389 impossible.

zuiderkwast avatar Dec 05 '24 18:12 zuiderkwast

Closing this PR as the underlying structure has been changed. Nevertheless, thanks for your work @RayaCoo.

hpatro avatar Jan 17 '25 20:01 hpatro