[improvement](memory) Storage page cache use LRU-K cache, K=2
Proposed changes
The storage page cache currently uses a plain LRU cache. Occasional batch operations can cause "cache pollution" in a plain LRU cache: hotspot data is squeezed out of the cache by non-hotspot data, which reduces the cache hit rate.
In the extreme case, if the number of pages inserted by each batch is greater than the cache capacity, the cache hit rate drops to 0.
Switching the storage page cache to an LRU-K cache (K=2) avoids "cache pollution" in most cases.
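To illustrate the pollution effect described above, here is a minimal simulation sketch in Python. It is not the Doris implementation: the class names, the `history_capacity` parameter, and the workload are assumptions made for illustration. The LRU-2 variant shown uses a simple admission rule (a key enters the cache only on its second recent access), which is one common way to approximate LRU-K with K=2.

```python
from collections import OrderedDict


class LRUCache:
    """Plain LRU: every miss is admitted, evicting the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def access(self, key):
        """Return True on a hit; on a miss, admit the key and return False."""
        if key in self.data:
            self.data.move_to_end(key)
            return True
        self.data[key] = None
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)
        return False


class LRU2Cache(LRUCache):
    """Simplified LRU-K with K=2 (hypothetical sketch, not Doris code)."""

    def __init__(self, capacity, history_capacity):
        super().__init__(capacity)
        self.history_capacity = history_capacity
        self.history = OrderedDict()  # keys seen once, not yet cached

    def access(self, key):
        if key in self.data:
            self.data.move_to_end(key)
            return True
        if key in self.history:
            # Second recent access: promote the key into the cache proper.
            del self.history[key]
            self.data[key] = None
            if len(self.data) > self.capacity:
                self.data.popitem(last=False)
        else:
            # First access: only remember the key; cached entries are untouched.
            self.history[key] = None
            if len(self.history) > self.history_capacity:
                self.history.popitem(last=False)
        return False


def hot_hit_rate(cache, rounds=20):
    """Alternate reads of 4 hot pages with a one-off scan of 16 cold pages."""
    hot = ["hot-%d" % i for i in range(4)]
    hits = total = 0
    next_cold = 0
    for _ in range(rounds):
        for key in hot:
            hits += cache.access(key)
            total += 1
        for _ in range(16):  # batch larger than the cache capacity of 8
            cache.access("cold-%d" % next_cold)
            next_cold += 1
    return hits / total


plain = hot_hit_rate(LRUCache(capacity=8))
lru2 = hot_hit_rate(LRU2Cache(capacity=8, history_capacity=32))
print("plain LRU hot hit rate:", plain)
print("LRU-2 hot hit rate:", lru2)
```

In this toy workload every batch is larger than the cache, so the plain LRU cache never serves a hot page, while the LRU-2 sketch keeps the hot pages resident once they have been seen twice. The history list must be large enough to still remember a hot key when its second access arrives; that sizing is a tuning choice in this sketch, not something taken from the PR.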
run buildall
TeamCity be ut coverage result:
Function Coverage: 36.47% (8537/23407)
Line Coverage: 28.60% (69450/242820)
Region Coverage: 27.63% (35945/130105)
Branch Coverage: 24.38% (18380/75400)
Coverage Report: http://coverage.selectdb-in.cc/coverage/bfd3a073833fc389f528039ed53c428e901d355c_bfd3a073833fc389f528039ed53c428e901d355c/report/index.html
(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 42.51 seconds
stream load tsv: 568 seconds loaded 74807831229 Bytes, about 125 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.6 seconds inserted 10000000 Rows, about 349K ops/s
storage size: 17183496267 Bytes
run buildall
TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit bfd3a073833fc389f528039ed53c428e901d355c, data reload: false
run tpch-sf100 query with default conf and session variables
query cold(ms) hot1(ms) hot2(ms) best_hot(ms)
q1 4582 3856 3820 3820
q2 316 280 159 159
q3 1300 1385 1004 1004
q4 826 944 534 534
q5 2677 2634 2802 2634
q6 202 163 130 130
q7 892 636 484 484
q8 1592 1765 1725 1725
q9 5912 5804 5806 5804
q10 2924 3079 2597 2597
q11 360 209 194 194
q12 331 326 208 208
q13 4523 4505 3824 3824
q14 243 238 215 215
q15 599 520 524 520
q16 438 435 393 393
q17 644 959 303 303
q18 7118 6723 6731 6723
q19 992 1196 1092 1092
q20 464 447 318 318
q21 2979 1981 2082 1981
q22 377 331 288 288
Total cold run time: 40291 ms
Total hot run time: 34950 ms
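The two totals above can be reproduced from the table rows. The sketch below assumes the four numeric columns are the cold run, two hot runs, and the better of the two hot runs, all in milliseconds; that reading is inferred from the numbers, not stated by the bot. The rows are copied from the first TPC-H table.

```python
# (query, cold, hot1, hot2) in ms, copied from the table above.
rows = [
    ("q1", 4582, 3856, 3820), ("q2", 316, 280, 159),
    ("q3", 1300, 1385, 1004), ("q4", 826, 944, 534),
    ("q5", 2677, 2634, 2802), ("q6", 202, 163, 130),
    ("q7", 892, 636, 484), ("q8", 1592, 1765, 1725),
    ("q9", 5912, 5804, 5806), ("q10", 2924, 3079, 2597),
    ("q11", 360, 209, 194), ("q12", 331, 326, 208),
    ("q13", 4523, 4505, 3824), ("q14", 243, 238, 215),
    ("q15", 599, 520, 524), ("q16", 438, 435, 393),
    ("q17", 644, 959, 303), ("q18", 7118, 6723, 6731),
    ("q19", 992, 1196, 1092), ("q20", 464, 447, 318),
    ("q21", 2979, 1981, 2082), ("q22", 377, 331, 288),
]

# Total cold time: sum of the first (cold) run of every query.
total_cold = sum(cold for _, cold, _, _ in rows)

# Total hot time: sum of the faster of the two hot runs of every query.
total_hot = sum(min(hot1, hot2) for _, _, hot1, hot2 in rows)

print("Total cold run time:", total_cold, "ms")
print("Total hot run time:", total_hot, "ms")
```

Under that assumption the sums match the reported 40291 ms and 34950 ms, so the last column of each row is the minimum of the two hot runs.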
run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query cold(ms) hot1(ms) hot2(ms) best_hot(ms)
q1 4166 4165 4141 4141
q2 249 270 177 177
q3 3261 3300 2971 2971
q4 2133 2213 1789 1789
q5 5321 5330 5293 5293
q6 232 152 124 124
q7 2223 2004 1845 1845
q8 3084 3076 2963 2963
q9 8223 8160 8138 8138
q10 3750 3838 3371 3371
q11 563 404 367 367
q12 760 756 608 608
q13 4253 4269 3558 3558
q14 287 290 261 261
q15 597 523 526 523
q16 513 507 484 484
q17 1700 1753 1569 1569
q18 8296 7945 7866 7866
q19 1210 1203 1207 1203
q20 2221 2184 1942 1942
q21 6069 5178 5299 5178
q22 526 464 412 412
Total cold run time: 59637 ms
Total hot run time: 54783 ms
(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 42.25 seconds
stream load tsv: 572 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17184186377 Bytes
TeamCity be ut coverage result:
Function Coverage: 36.47% (8536/23406)
Line Coverage: 28.60% (69443/242848)
Region Coverage: 27.63% (35954/130124)
Branch Coverage: 24.37% (18382/75418)
Coverage Report: http://coverage.selectdb-in.cc/coverage/32ebcb7ab0e1acf07d88a9eee3698de7e738da0d_32ebcb7ab0e1acf07d88a9eee3698de7e738da0d/report/index.html
TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 32ebcb7ab0e1acf07d88a9eee3698de7e738da0d, data reload: false
run tpch-sf100 query with default conf and session variables
query cold(ms) hot1(ms) hot2(ms) best_hot(ms)
q1 4568 3805 3864 3805
q2 326 270 158 158
q3 1314 1400 1028 1028
q4 817 926 521 521
q5 2650 2686 2749 2686
q6 201 157 129 129
q7 887 646 495 495
q8 1597 1725 1696 1696
q9 5904 5769 5754 5754
q10 2904 3097 2598 2598
q11 354 207 204 204
q12 337 338 207 207
q13 4557 4512 3834 3834
q14 242 237 214 214
q15 602 533 527 527
q16 442 431 387 387
q17 653 923 305 305
q18 7211 6801 6766 6766
q19 1003 1171 1120 1120
q20 498 456 298 298
q21 2946 1990 2081 1990
q22 375 329 288 288
Total cold run time: 40388 ms
Total hot run time: 35010 ms
run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query cold(ms) hot1(ms) hot2(ms) best_hot(ms)
q1 4159 4153 4202 4153
q2 249 272 173 173
q3 3293 3307 2970 2970
q4 2135 2215 1792 1792
q5 5346 5302 5317 5302
q6 234 153 122 122
q7 2267 1997 1867 1867
q8 3083 3052 3000 3000
q9 8208 8159 8178 8159
q10 3735 3849 3374 3374
q11 554 390 379 379
q12 765 766 598 598
q13 4273 4299 3580 3580
q14 287 291 275 275
q15 617 521 524 521
q16 527 509 487 487
q17 1724 1765 1639 1639
q18 8322 7778 7915 7778
q19 1214 1202 1192 1192
q20 2223 2186 1944 1944
q21 6110 5213 5334 5213
q22 530 469 426 426
Total cold run time: 59855 ms
Total hot run time: 54944 ms
run buildall
TeamCity be ut coverage result:
Function Coverage: 35.72% (8549/23936)
Line Coverage: 27.55% (69419/251951)
Region Coverage: 26.71% (36016/134827)
Branch Coverage: 23.52% (18419/78298)
Coverage Report: http://coverage.selectdb-in.cc/coverage/f73e9c404bcb1213848650d993e0b76626675ca1_f73e9c404bcb1213848650d993e0b76626675ca1/report/index.html
run buildall
clang-tidy review says "All clean, LGTM! :+1:"
run buildall
clang-tidy review says "All clean, LGTM! :+1:"
TeamCity be ut coverage result:
Function Coverage: 35.73% (8548/23924)
Line Coverage: 27.54% (69399/251976)
Region Coverage: 26.70% (36008/134848)
Branch Coverage: 23.51% (18409/78314)
Coverage Report: http://coverage.selectdb-in.cc/coverage/c0afcf21c73fb5bb7d3f240e1e1b3e2916ef6d10_c0afcf21c73fb5bb7d3f240e1e1b3e2916ef6d10/report/index.html
TPC-H: Total hot run time: 35500 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c0afcf21c73fb5bb7d3f240e1e1b3e2916ef6d10, data reload: false
------ Round 1 ----------------------------------
query cold(ms) hot1(ms) hot2(ms) best_hot(ms)
q1 17687 4513 4362 4362
q2 2040 244 133 133
q3 10495 1442 1003 1003
q4 4669 1216 1009 1009
q5 7610 2782 2603 2603
q6 185 157 135 135
q7 1037 967 765 765
q8 9198 1469 1101 1101
q9 6440 5352 5302 5302
q10 8170 3074 2644 2644
q11 419 243 228 228
q12 688 471 354 354
q13 17935 4435 3658 3658
q14 291 287 268 268
q15 685 519 508 508
q16 456 458 416 416
q17 453 618 479 479
q18 7033 6315 6429 6315
q19 1289 671 600 600
q20 414 433 289 289
q21 6080 2992 3156 2992
q22 385 377 336 336
Total cold run time: 103659 ms
Total hot run time: 35500 ms
----- Round 2, with runtime_filter_mode=off -----
query cold(ms) hot1(ms) hot2(ms) best_hot(ms)
q1 5055 5015 4974 4974
q2 269 310 193 193
q3 3261 3397 2901 2901
q4 2252 2369 1726 1726
q5 5317 5115 5108 5108
q6 215 165 128 128
q7 2136 1848 1721 1721
q8 2669 2367 2380 2367
q9 7837 7779 7751 7751
q10 3892 4050 3573 3573
q11 642 474 413 413
q12 766 786 565 565
q13 3820 4233 3443 3443
q14 260 268 241 241
q15 662 515 500 500
q16 459 480 465 465
q17 1599 1601 1557 1557
q18 8175 7410 7355 7355
q19 764 731 733 731
q20 2048 2052 1867 1867
q21 6074 5328 5360 5328
q22 599 574 493 493
Total cold run time: 58771 ms
Total hot run time: 53400 ms
TPC-DS: Total hot run time: 175616 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c0afcf21c73fb5bb7d3f240e1e1b3e2916ef6d10, data reload: false
query cold(ms) hot1(ms) hot2(ms) best_hot(ms)
query1 924 349 337 337
query2 6539 1905 1905 1905
query3 6708 205 202 202
query4 23316 21504 21222 21222
query5 4232 502 375 375
query6 257 184 173 173
query7 4608 486 311 311
query8 253 205 204 204
query9 8447 2815 2797 2797
query10 404 278 219 219
query11 15007 14906 14469 14469
query12 146 85 87 85
query13 1713 544 421 421
query14 9650 7724 7851 7724
query15 212 221 189 189
query16 7495 413 256 256
query17 1349 673 527 527
query18 1954 345 269 269
query19 181 176 144 144
query20 85 84 88 84
query21 188 150 118 118
query22 4824 4809 4653 4653
query23 33007 31485 31496 31485
query24 8108 2746 2802 2746
query25 437 418 349 349
query26 1206 271 165 165
query27 2719 428 314 314
query28 4355 1822 1804 1804
query29 876 775 643 643
query30 207 166 140 140
query31 919 858 765 765
query32 90 63 61 61
query33 473 302 231 231
query34 775 817 493 493
query35 902 890 835 835
query36 960 912 913 912
query37 89 91 62 62
query38 3230 3272 3225 3225
query39 1405 1337 1339 1337
query40 205 123 109 109
query41 38 35 34 34
query42 103 104 98 98
query43 506 481 464 464
query44 1191 697 712 697
query45 193 190 182 182
query46 801 985 622 622
query47 1606 1626 1550 1550
query48 406 427 348 348
query49 650 398 314 314
query50 603 630 377 377
query51 4444 4339 4299 4299
query52 108 98 95 95
query53 378 401 308 308
query54 246 247 226 226
query55 84 88 79 79
query56 218 239 202 202
query57 1012 1007 925 925
query58 219 210 206 206
query59 2365 2477 2346 2346
query60 228 234 219 219
query61 85 87 84 84
query62 521 434 378 378
query63 363 295 298 295
query64 5114 2971 2491 2491
query65 3330 3248 3251 3248
query66 991 426 329 329
query67 14592 14401 14178 14178
query68 2266 925 500 500
query69 492 383 353 353
query70 1249 1207 1178 1178
query71 305 276 253 253
query72 5082 2887 2764 2764
query73 497 603 312 312
query74 6847 6590 6363 6363
query75 3034 2975 2564 2564
query76 2193 1037 658 658
query77 281 319 232 232
query78 9252 9345 8777 8777
query79 895 864 502 502
query80 530 475 357 357
query81 404 247 207 207
query82 917 119 87 87
query83 242 144 120 120
query84 227 97 79 79
query85 800 419 349 349
query86 325 322 317 317
query87 3374 3426 3241 3241
query88 2890 2296 2296 2296
query89 469 417 367 367
query90 1687 169 167 167
query91 143 158 126 126
query92 54 56 49 49
query93 864 943 497 497
query94 496 308 183 183
query95 421 357 342 342
query96 457 578 262 262
query97 4394 4464 4250 4250
query98 224 208 197 197
query99 842 813 713 713
Total cold run time: 251003 ms
Total hot run time: 175616 ms
ClickBench: Total hot run time: 28.83 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c0afcf21c73fb5bb7d3f240e1e1b3e2916ef6d10, data reload: false
query cold(s) hot1(s) hot2(s)
query1 0.02 0.03 0.02
query2 0.06 0.02 0.03
query3 0.23 0.07 0.07
query4 1.65 0.08 0.08
query5 0.48 0.48 0.48
query6 1.33 0.61 0.62
query7 0.01 0.01 0.01
query8 0.04 0.02 0.02
query9 0.51 0.47 0.46
query10 0.49 0.49 0.49
query11 0.13 0.10 0.10
query12 0.12 0.10 0.10
query13 0.58 0.59 0.59
query14 0.76 0.78 0.77
query15 0.83 0.80 0.79
query16 0.33 0.33 0.34
query17 0.96 0.90 0.89
query18 0.17 0.18 0.18
query19 1.80 1.62 1.65
query20 0.01 0.01 0.02
query21 15.43 1.08 0.58
query22 0.70 0.88 0.81
query23 14.92 1.53 0.58
query24 2.19 0.74 0.29
query25 0.67 0.08 0.06
query26 0.15 0.14 0.14
query27 0.05 0.06 0.05
query28 13.03 1.48 0.80
query29 12.57 3.95 3.28
query30 0.52 0.49 0.45
query31 2.77 0.57 0.36
query32 3.21 0.55 0.48
query33 3.12 3.13 3.16
query34 14.84 5.08 4.46
query35 4.52 4.51 4.49
query36 1.05 0.99 0.95
query37 0.07 0.05 0.05
query38 0.04 0.03 0.02
query39 0.02 0.02 0.02
query40 0.18 0.15 0.15
query41 0.07 0.01 0.02
query42 0.02 0.02 0.01
query43 0.02 0.03 0.02
Total cold run time: 100.67 s
Total hot run time: 28.83 s
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Load test result on commit c0afcf21c73fb5bb7d3f240e1e1b3e2916ef6d10 with default session variables
Stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc: 61 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select: 15.1 seconds inserted 10000000 Rows, about 662K ops/s
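The "about N MB/s" and "about NK ops/s" figures in the load results can be cross-checked with simple arithmetic. The sketch below assumes the bot reports bytes per second in units of 1024 * 1024 bytes and rounds down; that rule is inferred from the reported numbers and is not documented in this thread. The helper names are hypothetical.

```python
MIB = 1024 * 1024  # assumption: "MB" in the bot output means 2**20 bytes


def mib_per_s(total_bytes, seconds):
    """Throughput in MiB/s, rounded down (integer seconds only)."""
    return total_bytes // (seconds * MIB)


def kilo_rows_per_s(rows, seconds):
    """Insert rate in thousands of rows per second, rounded down."""
    return int(rows / seconds / 1000)


# Figures copied from the load test result above.
json_rate = mib_per_s(2358488459, 19)
orc_rate = mib_per_s(1101869774, 61)
parquet_rate = mib_per_s(861443392, 32)
insert_rate = kilo_rows_per_s(10000000, 15.1)

print("json:", json_rate, "MB/s")
print("orc:", orc_rate, "MB/s")
print("parquet:", parquet_rate, "MB/s")
print("insert into select:", insert_rate, "K ops/s")
```

With these assumptions the computed values agree with the reported 118 MB/s, 17 MB/s, 25 MB/s, and 662K ops/s.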