[fix](inverted_index) fix tokenization issues for some characters in ik analyzer
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary: This PR fixes the IK analyzer's incorrect handling of full-width characters and adds tokenization support for emoji and rare characters, matching the behavior of the Elasticsearch IK analyzer.
The main changes are:
- Fix incorrect character classification.
- Convert full-width digits, letters, and punctuation marks to their half-width forms in the output.
- Add SurrogatePairSegmenter to tokenize emoji and rare characters.
- Add unit tests and regression tests.
- Remove some unnecessary code.
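To make the two character-handling changes above concrete, here is a minimal sketch of the underlying Unicode arithmetic. It is not the PR's implementation: the function names (`to_half_width`, `normalize_full_width`, `is_supplementary`, `to_surrogate_pair`, `split_supplementary`) are hypothetical, the input is assumed to be already decoded into code points, and emitting each supplementary code point as its own token is an assumption about how a surrogate-pair segmenter might behave.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not the PR's code: map one full-width code point to
// its half-width form. U+3000 (ideographic space) maps to U+0020; the range
// U+FF01..U+FF5E mirrors ASCII U+0021..U+007E at a fixed offset of 0xFEE0.
inline char32_t to_half_width(char32_t cp) {
    if (cp == 0x3000) return 0x0020;
    if (cp >= 0xFF01 && cp <= 0xFF5E) return static_cast<char32_t>(cp - 0xFEE0);
    return cp;  // everything else is left unchanged
}

// Apply the mapping to every code point of an already decoded string.
inline std::u32string normalize_full_width(const std::u32string& text) {
    std::u32string out;
    out.reserve(text.size());
    for (char32_t cp : text) out.push_back(to_half_width(cp));
    return out;
}

// Code points above the Basic Multilingual Plane (most emoji, rare CJK
// ideographs) need a surrogate pair in UTF-16 and four bytes in UTF-8.
inline bool is_supplementary(char32_t cp) {
    return cp >= 0x10000 && cp <= 0x10FFFF;
}

// Standard UTF-16 encoding of a supplementary code point.
inline std::pair<char16_t, char16_t> to_surrogate_pair(char32_t cp) {
    assert(is_supplementary(cp));
    const char32_t v = cp - 0x10000;
    return {static_cast<char16_t>(0xD800 + (v >> 10)),
            static_cast<char16_t>(0xDC00 + (v & 0x3FF))};
}

// Assumption for illustration: every supplementary code point becomes its
// own single-character token; all other code points are skipped here and
// would be handled by the other segmenters.
inline std::vector<std::u32string> split_supplementary(const std::u32string& text) {
    std::vector<std::u32string> tokens;
    for (char32_t cp : text) {
        if (is_supplementary(cp)) tokens.emplace_back(1, cp);
    }
    return tokens;
}
```

For example, under these assumptions `normalize_full_width` turns the full-width sequence U+FF21 U+FF11 U+FF01 into `A1!`, and `split_supplementary` extracts U+1F600 (an emoji) and U+20000 (a CJK Extension B ideograph) as separate tokens from mixed text.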
Release note
None
Check List (For Author)
- Test
    - [X] Regression test
    - [X] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason
- Behavior changed:
    - [X] No.
    - [ ] Yes.
- Does this need documentation?
    - [X] No.
    - [ ] Yes.
Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
run buildall
TPC-H: Total hot run time: 34949 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 5aaf69259ea171f55819df324a30a2fb01b51935, data reload: false
------ Round 1 ----------------------------------
q1 25874 5120 5051 5051
q2 2087 286 184 184
q3 10375 1269 710 710
q4 10248 1042 582 582
q5 7577 2431 2377 2377
q6 190 164 133 133
q7 933 744 623 623
q8 9323 1313 1161 1161
q9 6788 5130 5181 5130
q10 6839 2305 1901 1901
q11 494 289 268 268
q12 353 362 219 219
q13 17778 3699 3071 3071
q14 226 220 206 206
q15 533 499 492 492
q16 442 449 393 393
q17 611 891 388 388
q18 7369 7082 7024 7024
q19 1216 938 580 580
q20 351 356 227 227
q21 4489 3450 3273 3273
q22 1073 1006 956 956
Total cold run time: 115169 ms
Total hot run time: 34949 ms
----- Round 2, with runtime_filter_mode=off -----
q1 5136 5105 5093 5093
q2 245 331 233 233
q3 2222 2660 2292 2292
q4 1465 1843 1483 1483
q5 4564 4448 4345 4345
q6 212 166 125 125
q7 1963 1865 1755 1755
q8 2605 2594 2608 2594
q9 7120 7096 7142 7096
q10 2990 3210 2729 2729
q11 568 513 509 509
q12 701 785 615 615
q13 3455 3807 3328 3328
q14 291 305 270 270
q15 508 491 498 491
q16 466 492 463 463
q17 1139 1640 1389 1389
q18 7710 7509 7475 7475
q19 858 898 1086 898
q20 1965 1983 1867 1867
q21 5400 4662 4689 4662
q22 1091 1043 988 988
Total cold run time: 52674 ms
Total hot run time: 50700 ms
TPC-DS: Total hot run time: 185790 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5aaf69259ea171f55819df324a30a2fb01b51935, data reload: false
query1 998 461 486 461
query2 6554 1804 1792 1792
query3 6741 231 215 215
query4 26304 23644 23026 23026
query5 4306 642 462 462
query6 292 198 192 192
query7 4630 476 285 285
query8 286 238 222 222
query9 8594 2590 2589 2589
query10 467 342 269 269
query11 15222 15024 14792 14792
query12 154 121 116 116
query13 1672 541 410 410
query14 8845 6331 6331 6331
query15 210 194 175 175
query16 7361 670 518 518
query17 1211 734 585 585
query18 1991 412 315 315
query19 199 192 161 161
query20 121 118 122 118
query21 215 124 106 106
query22 4071 4115 4092 4092
query23 34056 33102 32979 32979
query24 8449 2425 2408 2408
query25 544 471 399 399
query26 1276 261 153 153
query27 2757 500 341 341
query28 4290 2113 2098 2098
query29 777 550 433 433
query30 284 218 188 188
query31 931 847 780 780
query32 72 64 64 64
query33 556 389 322 322
query34 813 849 523 523
query35 810 806 726 726
query36 982 980 922 922
query37 123 103 78 78
query38 4056 4169 4088 4088
query39 1443 1421 1389 1389
query40 215 119 110 110
query41 61 54 53 53
query42 119 102 111 102
query43 498 504 474 474
query44 1324 813 809 809
query45 184 174 167 167
query46 843 1028 636 636
query47 1749 1809 1727 1727
query48 379 411 302 302
query49 768 515 445 445
query50 662 683 409 409
query51 4138 4152 4089 4089
query52 104 105 102 102
query53 234 258 183 183
query54 587 582 512 512
query55 83 89 85 85
query56 337 294 291 291
query57 1115 1138 1066 1066
query58 262 291 255 255
query59 2547 2651 2546 2546
query60 326 331 300 300
query61 131 127 125 125
query62 808 717 678 678
query63 225 188 191 188
query64 4348 1020 713 713
query65 4386 4253 4258 4253
query66 1155 414 308 308
query67 15697 15377 15152 15152
query68 7836 889 524 524
query69 480 301 260 260
query70 1176 1144 1056 1056
query71 459 328 359 328
query72 5770 4677 4687 4677
query73 669 564 349 349
query74 8876 9085 8649 8649
query75 3872 3177 2745 2745
query76 3649 1185 777 777
query77 809 415 283 283
query78 9891 10164 9354 9354
query79 1999 815 563 563
query80 599 521 450 450
query81 468 262 224 224
query82 423 125 98 98
query83 264 247 233 233
query84 245 105 93 93
query85 789 352 328 328
query86 336 299 298 298
query87 4366 4513 4339 4339
query88 3631 2224 2243 2224
query89 387 322 279 279
query90 1918 213 219 213
query91 139 141 111 111
query92 76 60 59 59
query93 1414 938 585 585
query94 670 424 306 306
query95 377 284 278 278
query96 490 555 275 275
query97 3125 3213 3129 3129
query98 242 206 203 203
query99 1464 1410 1297 1297
Total cold run time: 272857 ms
Total hot run time: 185790 ms
ClickBench: Total hot run time: 28.97 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5aaf69259ea171f55819df324a30a2fb01b51935, data reload: false
query1 0.04 0.03 0.04
query2 0.12 0.10 0.11
query3 0.24 0.19 0.19
query4 1.59 0.19 0.11
query5 0.57 0.56 0.56
query6 1.19 0.70 0.72
query7 0.03 0.02 0.02
query8 0.04 0.04 0.03
query9 0.59 0.52 0.50
query10 0.57 0.56 0.57
query11 0.15 0.11 0.11
query12 0.14 0.11 0.11
query13 0.62 0.59 0.61
query14 1.20 1.18 1.18
query15 0.87 0.84 0.86
query16 0.39 0.38 0.38
query17 1.01 1.05 1.02
query18 0.21 0.20 0.19
query19 1.92 1.78 1.80
query20 0.02 0.01 0.01
query21 15.41 0.91 0.54
query22 0.78 1.31 0.60
query23 14.82 1.39 0.63
query24 7.41 1.69 0.29
query25 0.28 0.15 0.28
query26 0.68 0.16 0.14
query27 0.06 0.05 0.05
query28 9.68 0.86 0.43
query29 12.54 4.05 3.40
query30 0.25 0.09 0.06
query31 2.83 0.59 0.38
query32 3.22 0.55 0.47
query33 3.04 3.05 3.03
query34 15.84 5.12 4.50
query35 4.53 4.50 4.50
query36 0.68 0.49 0.48
query37 0.08 0.06 0.07
query38 0.06 0.04 0.04
query39 0.03 0.02 0.02
query40 0.17 0.14 0.13
query41 0.08 0.02 0.03
query42 0.03 0.02 0.02
query43 0.04 0.03 0.02
Total cold run time: 104.05 s
Total hot run time: 28.97 s
BE UT Coverage Report
Increment line coverage 92.78% (90/97) :tada:
Increment coverage report Complete coverage report
| Category | Coverage |
|---|---|
| Function Coverage | 53.16% (14425/27136) |
| Line Coverage | 42.04% (125053/297485) |
| Region Coverage | 40.83% (63843/156356) |
| Branch Coverage | 35.48% (32115/90508) |
run buildall
TPC-H: Total hot run time: 34240 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 6609880d8498cb44a7eea276830a0342a730fe14, data reload: false
------ Round 1 ----------------------------------
q1 25896 5066 4995 4995
q2 2065 332 201 201
q3 10315 1237 679 679
q4 10235 1002 533 533
q5 7510 2344 2348 2344
q6 176 162 134 134
q7 913 744 616 616
q8 9339 1308 1104 1104
q9 6927 5176 5146 5146
q10 6871 2307 1866 1866
q11 481 286 287 286
q12 357 356 222 222
q13 17782 3648 3068 3068
q14 222 220 210 210
q15 525 490 476 476
q16 442 444 408 408
q17 603 854 358 358
q18 7674 7227 7426 7227
q19 1472 970 590 590
q20 352 352 233 233
q21 4524 3426 2544 2544
q22 1073 1018 1000 1000
Total cold run time: 115754 ms
Total hot run time: 34240 ms
----- Round 2, with runtime_filter_mode=off -----
q1 5179 5059 5085 5059
q2 247 333 236 236
q3 2170 2636 2254 2254
q4 1427 1825 1442 1442
q5 4483 4350 4442 4350
q6 223 171 127 127
q7 2002 1910 1746 1746
q8 2589 2588 2540 2540
q9 7214 7234 7103 7103
q10 2974 3164 2754 2754
q11 562 486 488 486
q12 654 756 615 615
q13 3556 3833 3306 3306
q14 280 304 300 300
q15 528 487 481 481
q16 460 504 470 470
q17 1245 1559 1373 1373
q18 7877 7626 7486 7486
q19 803 820 851 820
q20 1950 1938 1814 1814
q21 5346 4790 4867 4790
q22 1110 1086 1010 1010
Total cold run time: 52879 ms
Total hot run time: 50562 ms
TPC-DS: Total hot run time: 192172 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6609880d8498cb44a7eea276830a0342a730fe14, data reload: false
query1 1373 1106 1033 1033
query2 6229 1887 1858 1858
query3 11168 4548 4398 4398
query4 53704 24931 23403 23403
query5 4958 638 438 438
query6 339 211 186 186
query7 4878 501 283 283
query8 297 236 229 229
query9 5276 2554 2559 2554
query10 450 321 261 261
query11 15088 15031 14854 14854
query12 157 105 99 99
query13 1020 495 377 377
query14 10093 6275 6369 6275
query15 196 213 183 183
query16 7062 668 540 540
query17 1091 751 585 585
query18 1550 416 327 327
query19 217 200 176 176
query20 138 125 128 125
query21 211 129 107 107
query22 4317 4465 4335 4335
query23 34030 33378 33627 33378
query24 6764 2376 2440 2376
query25 485 467 409 409
query26 734 279 165 165
query27 2331 508 340 340
query28 3026 2147 2155 2147
query29 598 572 502 502
query30 278 225 189 189
query31 874 877 773 773
query32 71 60 64 60
query33 449 360 322 322
query34 777 959 526 526
query35 828 853 798 798
query36 934 1013 923 923
query37 116 105 84 84
query38 4297 4333 4095 4095
query39 1515 1411 1419 1411
query40 209 116 113 113
query41 65 61 56 56
query42 132 103 100 100
query43 519 512 489 489
query44 1343 829 820 820
query45 177 172 171 171
query46 840 1026 627 627
query47 1844 1854 1781 1781
query48 380 411 310 310
query49 691 503 416 416
query50 652 695 411 411
query51 4220 4308 4270 4270
query52 110 111 99 99
query53 246 264 190 190
query54 586 578 503 503
query55 86 84 102 84
query56 322 311 287 287
query57 1173 1174 1141 1141
query58 260 279 275 275
query59 2813 2829 2913 2829
query60 321 316 309 309
query61 124 123 126 123
query62 746 745 680 680
query63 219 189 186 186
query64 1846 1054 677 677
query65 4388 4221 4246 4221
query66 756 394 298 298
query67 15828 15452 15502 15452
query68 7233 870 505 505
query69 522 300 257 257
query70 1171 1102 1115 1102
query71 490 318 290 290
query72 5740 4560 4777 4560
query73 1545 628 345 345
query74 8838 9208 8618 8618
query75 4157 3163 2692 2692
query76 4168 1192 742 742
query77 763 369 281 281
query78 10037 10239 9194 9194
query79 2467 807 563 563
query80 598 500 434 434
query81 484 258 224 224
query82 451 125 98 98
query83 246 241 231 231
query84 298 96 81 81
query85 767 361 311 311
query86 369 304 298 298
query87 4426 4445 4400 4400
query88 3746 2215 2283 2215
query89 405 312 288 288
query90 1818 213 222 213
query91 143 143 111 111
query92 77 58 57 57
query93 1924 963 584 584
query94 668 412 278 278
query95 369 295 288 288
query96 478 569 273 273
query97 3153 3188 3134 3134
query98 244 206 199 199
query99 1578 1379 1264 1264
Total cold run time: 298188 ms
Total hot run time: 192172 ms
ClickBench: Total hot run time: 29.9 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 6609880d8498cb44a7eea276830a0342a730fe14, data reload: false
query1 0.04 0.04 0.03
query2 0.12 0.10 0.12
query3 0.25 0.20 0.20
query4 1.60 0.19 0.19
query5 0.60 0.58 0.59
query6 1.18 0.73 0.73
query7 0.02 0.01 0.02
query8 0.04 0.04 0.04
query9 0.58 0.53 0.53
query10 0.57 0.57 0.56
query11 0.15 0.10 0.11
query12 0.14 0.11 0.12
query13 0.60 0.60 0.59
query14 1.20 1.17 1.17
query15 0.87 0.86 0.86
query16 0.37 0.39 0.38
query17 1.02 1.06 1.00
query18 0.20 0.20 0.20
query19 1.92 1.76 1.80
query20 0.02 0.01 0.01
query21 15.39 0.89 0.55
query22 0.76 1.25 0.65
query23 14.88 1.36 0.65
query24 7.00 1.47 0.98
query25 0.45 0.17 0.09
query26 0.64 0.16 0.15
query27 0.06 0.05 0.05
query28 9.49 0.89 0.44
query29 12.53 4.08 3.39
query30 0.26 0.09 0.06
query31 2.82 0.59 0.38
query32 3.24 0.54 0.47
query33 2.96 3.01 3.12
query34 15.79 5.10 4.54
query35 4.56 4.54 4.55
query36 0.69 0.49 0.48
query37 0.08 0.06 0.06
query38 0.06 0.04 0.03
query39 0.03 0.02 0.03
query40 0.17 0.14 0.14
query41 0.08 0.03 0.02
query42 0.03 0.02 0.02
query43 0.03 0.03 0.03
Total cold run time: 103.49 s
Total hot run time: 29.9 s
BE UT Coverage Report
Increment line coverage 93.48% (86/92) :tada:
Increment coverage report Complete coverage report
| Category | Coverage |
|---|---|
| Function Coverage | 53.17% (14431/27142) |
| Line Coverage | 42.03% (125055/297504) |
| Region Coverage | 40.86% (63881/156345) |
| Branch Coverage | 35.49% (32120/90494) |
run buildall
TPC-H: Total hot run time: 33807 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 517dc2c79a3ce82477b173156496be613e7520b5, data reload: false
------ Round 1 ----------------------------------
q1 25731 5891 4993 4993
q2 2056 267 190 190
q3 10401 1225 668 668
q4 10228 992 527 527
q5 7526 2340 2322 2322
q6 184 160 130 130
q7 901 733 607 607
q8 9319 1254 1107 1107
q9 6920 5145 5079 5079
q10 6811 2295 1896 1896
q11 467 275 279 275
q12 347 361 219 219
q13 17767 3640 3059 3059
q14 220 216 212 212
q15 534 482 478 478
q16 435 441 393 393
q17 591 842 358 358
q18 7489 7207 7097 7097
q19 1227 949 547 547
q20 325 327 216 216
q21 4078 3559 2476 2476
q22 1068 1017 958 958
Total cold run time: 114625 ms
Total hot run time: 33807 ms
----- Round 2, with runtime_filter_mode=off -----
q1 5037 5030 5028 5028
q2 240 330 238 238
q3 2145 2609 2330 2330
q4 1391 1776 1426 1426
q5 4477 4347 4329 4329
q6 218 169 127 127
q7 1954 1930 1718 1718
q8 2579 2614 2511 2511
q9 7227 7241 7136 7136
q10 2935 3136 2751 2751
q11 570 516 479 479
q12 679 784 627 627
q13 3512 3862 3347 3347
q14 286 287 266 266
q15 514 474 468 468
q16 473 503 462 462
q17 1153 1569 1359 1359
q18 7674 7647 7406 7406
q19 802 846 1016 846
q20 1926 1976 1888 1888
q21 5345 4762 4759 4759
q22 1056 1050 994 994
Total cold run time: 52193 ms
Total hot run time: 50495 ms
TPC-DS: Total hot run time: 191884 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 517dc2c79a3ce82477b173156496be613e7520b5, data reload: false
query1 1413 1087 1074 1074
query2 6155 1844 1844 1844
query3 10997 4637 4327 4327
query4 52321 26189 23346 23346
query5 5007 519 456 456
query6 348 227 196 196
query7 4879 495 282 282
query8 316 249 244 244
query9 5494 2564 2536 2536
query10 419 307 252 252
query11 15071 15069 14784 14784
query12 149 104 106 104
query13 1032 494 377 377
query14 10091 6246 6261 6246
query15 186 193 182 182
query16 7152 662 498 498
query17 1059 720 568 568
query18 1603 402 324 324
query19 191 184 158 158
query20 136 127 114 114
query21 200 122 108 108
query22 4330 4424 4284 4284
query23 33935 33289 33284 33284
query24 6515 2393 2446 2393
query25 465 496 379 379
query26 690 274 158 158
query27 2289 507 337 337
query28 3018 2129 2125 2125
query29 599 584 452 452
query30 274 232 198 198
query31 863 880 811 811
query32 80 66 61 61
query33 467 368 316 316
query34 774 886 523 523
query35 803 843 751 751
query36 954 1024 915 915
query37 115 104 76 76
query38 4222 4225 4110 4110
query39 1498 1475 1445 1445
query40 217 133 102 102
query41 56 57 50 50
query42 117 108 110 108
query43 500 528 491 491
query44 1300 814 806 806
query45 181 175 167 167
query46 847 1024 645 645
query47 1880 1874 1777 1777
query48 391 409 314 314
query49 673 499 405 405
query50 660 689 412 412
query51 4217 4285 4224 4224
query52 106 106 99 99
query53 226 263 183 183
query54 577 588 514 514
query55 86 83 89 83
query56 307 308 294 294
query57 1138 1159 1109 1109
query58 268 252 260 252
query59 2728 2833 2780 2780
query60 338 333 307 307
query61 132 132 132 132
query62 742 762 699 699
query63 223 188 186 186
query64 1537 1122 789 789
query65 4382 4244 4249 4244
query66 797 411 319 319
query67 15873 15589 15192 15192
query68 7517 903 521 521
query69 537 305 274 274
query70 1158 1087 1102 1087
query71 495 313 285 285
query72 5953 4835 4992 4835
query73 1496 672 362 362
query74 8930 9108 8739 8739
query75 3855 3226 2721 2721
query76 4217 1192 782 782
query77 649 357 275 275
query78 9989 10075 9224 9224
query79 2284 817 555 555
query80 713 504 436 436
query81 484 258 217 217
query82 454 124 93 93
query83 252 250 227 227
query84 302 104 91 91
query85 764 359 313 313
query86 374 302 288 288
query87 4430 4469 4335 4335
query88 3648 2229 2210 2210
query89 403 311 283 283
query90 1799 206 206 206
query91 141 146 107 107
query92 75 58 58 58
query93 2020 948 580 580
query94 660 386 301 301
query95 377 304 338 304
query96 480 620 274 274
query97 3127 3270 3132 3132
query98 236 207 196 196
query99 1435 1436 1252 1252
Total cold run time: 295863 ms
Total hot run time: 191884 ms
ClickBench: Total hot run time: 29.71 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 517dc2c79a3ce82477b173156496be613e7520b5, data reload: false
query1 0.04 0.03 0.03
query2 0.13 0.11 0.12
query3 0.26 0.19 0.19
query4 1.59 0.19 0.20
query5 0.58 0.60 0.58
query6 1.18 0.72 0.72
query7 0.03 0.02 0.01
query8 0.04 0.03 0.03
query9 0.57 0.53 0.52
query10 0.57 0.57 0.57
query11 0.15 0.11 0.10
query12 0.15 0.12 0.11
query13 0.61 0.60 0.60
query14 1.16 1.18 1.18
query15 0.88 0.85 0.85
query16 0.38 0.37 0.39
query17 1.03 1.05 1.05
query18 0.21 0.20 0.19
query19 1.87 1.80 1.80
query20 0.01 0.01 0.02
query21 15.42 0.93 0.56
query22 0.74 1.16 0.70
query23 14.95 1.36 0.61
query24 7.61 1.45 0.76
query25 0.46 0.14 0.19
query26 0.52 0.16 0.13
query27 0.06 0.05 0.05
query28 10.10 0.80 0.42
query29 12.57 4.11 3.41
query30 0.25 0.10 0.07
query31 2.82 0.58 0.39
query32 3.23 0.55 0.48
query33 3.06 3.00 3.07
query34 15.69 5.09 4.47
query35 4.54 4.49 4.52
query36 0.67 0.50 0.50
query37 0.08 0.07 0.06
query38 0.06 0.04 0.04
query39 0.04 0.02 0.02
query40 0.17 0.13 0.13
query41 0.08 0.03 0.02
query42 0.03 0.02 0.02
query43 0.04 0.04 0.03
Total cold run time: 104.63 s
Total hot run time: 29.71 s
BE UT Coverage Report
Increment line coverage 93.48% (86/92) :tada:
Increment coverage report Complete coverage report
| Category | Coverage |
|---|---|
| Function Coverage | 53.17% (14431/27142) |
| Line Coverage | 42.03% (125044/297504) |
| Region Coverage | 40.85% (63873/156345) |
| Branch Coverage | 35.48% (32111/90494) |
run buildall
TPC-H: Total hot run time: 33774 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 44223bbc93d68b84fbb8ec80e1926cac67caac32, data reload: false
------ Round 1 ----------------------------------
q1 25894 5011 4993 4993
q2 2074 279 178 178
q3 10407 1283 694 694
q4 10225 997 543 543
q5 7531 2262 2372 2262
q6 188 165 132 132
q7 873 752 619 619
q8 9327 1264 1062 1062
q9 6707 5144 5116 5116
q10 6813 2305 1901 1901
q11 472 283 269 269
q12 345 354 218 218
q13 17771 3671 3076 3076
q14 226 220 214 214
q15 530 488 496 488
q16 439 440 391 391
q17 593 861 373 373
q18 7563 7166 7135 7135
q19 1202 961 555 555
q20 327 336 225 225
q21 3923 2652 2386 2386
q22 1041 1005 944 944
Total cold run time: 114471 ms
Total hot run time: 33774 ms
----- Round 2, with runtime_filter_mode=off -----
q1 5117 5098 5071 5071
q2 231 329 236 236
q3 2103 2623 2253 2253
q4 1396 1818 1402 1402
q5 4416 4464 4426 4426
q6 226 184 132 132
q7 1950 1904 1725 1725
q8 2570 2532 2479 2479
q9 7184 7048 7103 7048
q10 2969 3188 2759 2759
q11 564 483 494 483
q12 705 753 597 597
q13 3423 3840 3327 3327
q14 281 296 257 257
q15 519 482 478 478
q16 462 494 467 467
q17 1143 1563 1411 1411
q18 7497 7428 7489 7428
q19 802 811 872 811
q20 1981 2012 1880 1880
q21 5258 4714 4545 4545
q22 1051 1006 943 943
Total cold run time: 51848 ms
Total hot run time: 50158 ms
TPC-DS: Total hot run time: 184837 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 44223bbc93d68b84fbb8ec80e1926cac67caac32, data reload: false
query1 1026 468 517 468
query2 6585 1819 1731 1731
query3 6737 221 213 213
query4 26097 23663 23109 23109
query5 4590 616 463 463
query6 290 226 179 179
query7 4635 485 282 282
query8 294 243 238 238
query9 8649 2546 2572 2546
query10 451 318 263 263
query11 15324 15055 14733 14733
query12 163 111 99 99
query13 1643 500 387 387
query14 9278 5968 5987 5968
query15 202 185 167 167
query16 7251 616 475 475
query17 1183 711 575 575
query18 1969 408 306 306
query19 194 182 153 153
query20 124 121 114 114
query21 217 134 109 109
query22 4064 4183 4176 4176
query23 33779 32966 32946 32946
query24 8449 2348 2369 2348
query25 568 482 398 398
query26 1219 279 157 157
query27 2743 498 335 335
query28 4387 2103 2096 2096
query29 812 552 414 414
query30 283 208 182 182
query31 931 848 761 761
query32 73 65 63 63
query33 560 347 315 315
query34 780 831 516 516
query35 784 805 739 739
query36 928 992 877 877
query37 108 105 75 75
query38 4197 4116 4074 4074
query39 1450 1438 1404 1404
query40 215 119 105 105
query41 57 51 54 51
query42 116 106 108 106
query43 469 503 447 447
query44 1305 793 786 786
query45 176 167 172 167
query46 822 1021 603 603
query47 1794 1826 1760 1760
query48 371 408 294 294
query49 777 514 432 432
query50 620 689 385 385
query51 4164 4094 4039 4039
query52 115 113 102 102
query53 230 247 180 180
query54 563 563 493 493
query55 86 81 81 81
query56 283 304 301 301
query57 1121 1145 1058 1058
query58 258 262 247 247
query59 2509 2700 2485 2485
query60 329 340 288 288
query61 131 144 121 121
query62 794 723 650 650
query63 224 182 185 182
query64 4302 1002 673 673
query65 4294 4193 4215 4193
query66 1126 420 331 331
query67 15823 15514 15257 15257
query68 8394 875 513 513
query69 450 316 256 256
query70 1194 1042 1052 1042
query71 442 311 296 296
query72 5776 4758 4832 4758
query73 709 638 346 346
query74 9019 8997 9058 8997
query75 3949 3192 2659 2659
query76 3691 1177 755 755
query77 790 378 281 281
query78 9981 10066 9214 9214
query79 2422 806 575 575
query80 632 496 438 438
query81 464 244 215 215
query82 455 128 97 97
query83 285 243 228 228
query84 295 100 86 86
query85 783 347 336 336
query86 336 312 293 293
query87 4490 4395 4324 4324
query88 2994 2195 2211 2195
query89 382 315 281 281
query90 1957 217 208 208
query91 140 138 108 108
query92 82 60 60 60
query93 1326 947 582 582
query94 673 407 294 294
query95 385 293 281 281
query96 488 556 276 276
query97 3206 3217 3118 3118
query98 232 205 208 205
query99 1444 1385 1249 1249
Total cold run time: 273957 ms
Total hot run time: 184837 ms
ClickBench: Total hot run time: 29.12 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 44223bbc93d68b84fbb8ec80e1926cac67caac32, data reload: false
query1 0.04 0.04 0.02
query2 0.12 0.11 0.11
query3 0.26 0.20 0.19
query4 1.59 0.20 0.20
query5 0.60 0.59 0.61
query6 1.19 0.72 0.72
query7 0.02 0.02 0.02
query8 0.05 0.03 0.03
query9 0.57 0.52 0.51
query10 0.54 0.57 0.55
query11 0.16 0.10 0.10
query12 0.14 0.11 0.11
query13 0.61 0.60 0.59
query14 1.16 1.16 1.21
query15 0.89 0.85 0.85
query16 0.38 0.39 0.37
query17 1.05 0.99 1.02
query18 0.21 0.20 0.20
query19 1.89 1.82 1.82
query20 0.02 0.00 0.01
query21 15.41 0.88 0.53
query22 0.74 1.24 0.61
query23 15.00 1.37 0.60
query24 6.69 2.14 0.52
query25 0.51 0.06 0.07
query26 0.61 0.17 0.14
query27 0.06 0.04 0.04
query28 10.44 0.90 0.43
query29 12.55 3.95 3.35
query30 0.25 0.09 0.06
query31 2.83 0.58 0.38
query32 3.23 0.54 0.46
query33 2.95 3.06 3.05
query34 15.67 5.06 4.49
query35 4.50 4.52 4.49
query36 0.67 0.50 0.47
query37 0.09 0.06 0.06
query38 0.04 0.04 0.03
query39 0.03 0.02 0.03
query40 0.17 0.13 0.13
query41 0.08 0.03 0.02
query42 0.03 0.03 0.02
query43 0.04 0.03 0.03
Total cold run time: 104.08 s
Total hot run time: 29.12 s
run buildall
TPC-H: Total hot run time: 33719 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 2d20046e945313b3d7d44454b224db073133b850, data reload: false
------ Round 1 ----------------------------------
q1 26069 5095 5018 5018
q2 2089 261 181 181
q3 10420 1233 694 694
q4 10208 1008 509 509
q5 7532 2343 2266 2266
q6 188 166 137 137
q7 900 735 611 611
q8 9321 1220 1061 1061
q9 7045 5026 5159 5026
q10 6846 2364 1934 1934
q11 508 298 285 285
q12 362 360 215 215
q13 17770 3688 3106 3106
q14 219 223 205 205
q15 537 487 475 475
q16 435 442 391 391
q17 595 854 367 367
q18 7485 7076 7133 7076
q19 1221 957 524 524
q20 335 334 231 231
q21 4067 3393 2457 2457
q22 1036 1019 950 950
Total cold run time: 115188 ms
Total hot run time: 33719 ms
----- Round 2, with runtime_filter_mode=off -----
q1 5097 5041 5001 5001
q2 245 332 230 230
q3 2144 2616 2302 2302
q4 1353 1760 1376 1376
q5 4446 4372 4368 4368
q6 214 168 133 133
q7 2002 1931 1775 1775
q8 2582 2523 2507 2507
q9 7272 7208 6934 6934
q10 3022 3182 2732 2732
q11 572 500 486 486
q12 672 755 599 599
q13 3558 3913 3292 3292
q14 284 293 274 274
q15 543 488 488 488
q16 502 500 463 463
q17 1114 1496 1374 1374
q18 7692 7551 7405 7405
q19 793 815 1034 815
q20 1973 2110 1839 1839
q21 5143 4812 4662 4662
q22 1101 1028 1034 1028
Total cold run time: 52324 ms
Total hot run time: 50083 ms
TPC-DS: Total hot run time: 192490 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2d20046e945313b3d7d44454b224db073133b850, data reload: false
query1 1425 1076 1055 1055
query2 6146 1807 1791 1791
query3 11012 4461 4451 4451
query4 57728 24496 23463 23463
query5 5156 457 462 457
query6 394 187 186 186
query7 5296 514 280 280
query8 334 244 221 221
query9 7204 2592 2616 2592
query10 442 340 263 263
query11 15300 14979 14934 14934
query12 155 112 102 102
query13 1286 512 397 397
query14 10030 6282 6357 6282
query15 200 198 179 179
query16 7137 638 493 493
query17 1071 703 566 566
query18 1611 402 302 302
query19 190 191 165 165
query20 134 136 119 119
query21 205 118 114 114
query22 4572 4548 4425 4425
query23 34267 33554 33517 33517
query24 6694 2452 2466 2452
query25 485 513 432 432
query26 696 288 164 164
query27 2225 502 344 344
query28 3440 2156 2141 2141
query29 616 577 454 454
query30 286 223 197 197
query31 877 866 794 794
query32 74 66 108 66
query33 487 381 304 304
query34 762 848 538 538
query35 809 826 753 753
query36 965 992 894 894
query37 112 102 74 74
query38 4167 4303 4139 4139
query39 1500 1454 1454 1454
query40 250 119 106 106
query41 54 53 52 52
query42 126 111 114 111
query43 524 518 511 511
query44 1330 830 815 815
query45 188 175 167 167
query46 849 1025 647 647
query47 1800 1867 1803 1803
query48 376 420 302 302
query49 685 527 465 465
query50 673 682 415 415
query51 4227 4234 4155 4155
query52 111 109 104 104
query53 231 265 183 183
query54 583 584 526 526
query55 85 81 81 81
query56 323 291 285 285
query57 1193 1213 1128 1128
query58 261 258 258 258
query59 2740 2735 2765 2735
query60 316 314 303 303
query61 134 126 128 126
query62 733 755 697 697
query63 221 191 216 191
query64 1748 1018 652 652
query65 4328 4204 4243 4204
query66 708 399 304 304
query67 16054 15505 15433 15433
query68 7389 872 515 515
query69 547 306 256 256
query70 1173 1093 1054 1054
query71 504 321 286 286
query72 5525 4709 4832 4709
query73 1171 617 348 348
query74 9222 8834 8671 8671
query75 3840 3229 2697 2697
query76 4290 1193 735 735
query77 612 370 277 277
query78 10050 10182 9295 9295
query79 2836 799 557 557
query80 808 486 433 433
query81 489 254 213 213
query82 500 127 94 94
query83 348 247 231 231
query84 295 105 90 90
query85 790 356 312 312
query86 413 309 268 268
query87 4318 4474 4251 4251
query88 3426 2173 2185 2173
query89 439 320 295 295
query90 1800 235 208 208
query91 135 148 108 108
query92 69 59 59 59
query93 2117 936 571 571
query94 643 402 293 293
query95 366 294 276 276
query96 486 569 268 268
query97 3192 3282 3149 3149
query98 221 212 197 197
query99 1449 1396 1288 1288
Total cold run time: 305804 ms
Total hot run time: 192490 ms
ClickBench: Total hot run time: 29.47 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2d20046e945313b3d7d44454b224db073133b850, data reload: false
query1 0.03 0.03 0.04
query2 0.12 0.10 0.11
query3 0.25 0.19 0.19
query4 1.59 0.19 0.18
query5 0.59 0.58 0.59
query6 1.19 0.71 0.74
query7 0.02 0.02 0.02
query8 0.05 0.04 0.04
query9 0.57 0.51 0.52
query10 0.56 0.57 0.58
query11 0.15 0.11 0.11
query12 0.14 0.11 0.12
query13 0.61 0.59 0.60
query14 1.17 1.21 1.17
query15 0.86 0.85 0.84
query16 0.40 0.39 0.39
query17 1.05 1.01 1.01
query18 0.21 0.20 0.20
query19 1.95 1.80 1.81
query20 0.01 0.02 0.01
query21 15.40 0.91 0.56
query22 0.76 1.32 0.67
query23 14.76 1.36 0.65
query24 7.46 1.25 0.65
query25 0.49 0.14 0.08
query26 0.58 0.16 0.14
query27 0.05 0.05 0.05
query28 9.89 0.86 0.44
query29 12.61 4.03 3.36
query30 0.25 0.10 0.06
query31 2.82 0.57 0.39
query32 3.23 0.54 0.48
query33 2.95 3.07 3.03
query34 15.79 5.07 4.46
query35 4.50 4.52 4.49
query36 0.69 0.50 0.48
query37 0.09 0.07 0.06
query38 0.05 0.04 0.03
query39 0.03 0.02 0.03
query40 0.16 0.13 0.13
query41 0.08 0.03 0.03
query42 0.04 0.02 0.02
query43 0.03 0.03 0.03
Total cold run time: 104.23 s
Total hot run time: 29.47 s
BE UT Coverage Report
Increment line coverage :tada:
Increment coverage report Complete coverage report
| Category | Coverage |
|---|---|
| Function Coverage | 54.39% (14746/27110) |
| Line Coverage | 43.50% (129082/296737) |
| Region Coverage | 42.20% (65893/156150) |
| Branch Coverage | 36.73% (33199/90388) |
BE Regression P0 && UT Coverage Report
Increment line coverage 92.78% (90/97) :tada:
Increment coverage report Complete coverage report
| Category | Coverage |
|---|---|
| Function Coverage | 55.56% (14785/26613) |
| Line Coverage | 45.23% (134002/296252) |
| Region Coverage | 42.16% (76971/182551) |
| Branch Coverage | 36.18% (37239/102938) |
run buildall
BE UT Coverage Report
Increment line coverage 93.27% (97/104) :tada:
Increment coverage report Complete coverage report
| Category | Coverage |
|---|---|
| Function Coverage | 54.82% (14778/26956) |
| Line Coverage | 43.95% (129703/295127) |
| Region Coverage | 42.65% (66187/155183) |
| Branch Coverage | 37.26% (33413/89678) |
BE Regression P0 && UT Coverage Report
Increment line coverage 93.20% (96/103) :tada:
Increment coverage report Complete coverage report
| Category | Coverage |
|---|---|
| Function Coverage | 57.76% (15282/26458) |
| Line Coverage | 47.74% (140648/294640) |
| Region Coverage | 44.77% (81305/181586) |
| Branch Coverage | 38.68% (39541/102238) |