[fix](inverted_index) fix tokenization issues for some characters in ik analyzer

Open Ryan19929 opened this issue 8 months ago • 42 comments

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary: This PR fixes the IK analyzer's incorrect handling of full-width characters and adds tokenization support for emoji and rare characters, consistent with Elasticsearch IK behavior.

The main changes are:

  • Fix incorrect character classification.
  • Convert full-width digits, letters, and punctuation marks to half-width characters in the output.
  • Add SurrogatePairSegmenter to tokenize emoji and rare characters.
  • Add unit tests and regression tests.
  • Remove some unnecessary code.
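
For readers unfamiliar with the two character-handling changes, here is a minimal Python sketch of the underlying ideas. It is not the code in this PR (the diff is not shown here), and the function names `to_half_width`, `utf16_units`, and `split_supplementary` are hypothetical. It relies only on two Unicode facts: the full-width forms U+FF01 to U+FF5E sit exactly 0xFEE0 above their ASCII counterparts (with U+3000 as the full-width space), and code points above U+FFFF are stored in UTF-16 as a high/low surrogate pair. Emitting each such character as its own token is an assumption about what SurrogatePairSegmenter does.

```python
from typing import List

FULLWIDTH_START = 0xFF01    # full-width '!'
FULLWIDTH_END = 0xFF5E      # full-width '~'
FULLWIDTH_OFFSET = 0xFEE0   # distance to the ASCII counterpart
IDEOGRAPHIC_SPACE = 0x3000  # full-width space


def to_half_width(text: str) -> str:
    """Map full-width ASCII variants to half-width; leave other characters alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp == IDEOGRAPHIC_SPACE:
            out.append(" ")
        elif FULLWIDTH_START <= cp <= FULLWIDTH_END:
            out.append(chr(cp - FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


def utf16_units(ch: str) -> List[int]:
    """Return the UTF-16 code units of one character (two units = surrogate pair)."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def split_supplementary(text: str) -> List[str]:
    """Assumed behavior: each character above U+FFFF becomes its own token,
    and runs of other characters are kept together for later segmenters."""
    tokens: List[str] = []
    run: List[str] = []
    for ch in text:
        if ord(ch) > 0xFFFF:
            if run:
                tokens.append("".join(run))
                run = []
            tokens.append(ch)
        else:
            run.append(ch)
    if run:
        tokens.append("".join(run))
    return tokens
```

For example, `to_half_width("\uFF21\uFF11")` yields `"A1"`, and `utf16_units("\U0001F600")` yields the pair `[0xD83D, 0xDE00]`. A tokenizer that walks UTF-16 code units one at a time would see such a character as two separate, individually invalid units.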

Release note

None

Check List (For Author)

  • Test

    • [X] Regression test
    • [X] Unit Test
    • [ ] Manual test (add detailed scripts or steps below)
    • [ ] No need to test or manual test. Explain why:
      • [ ] This is a refactor/code format and no logic has been changed.
      • [ ] Previous test can cover this change.
      • [ ] No code files have been changed.
      • [ ] Other reason
  • Behavior changed:

    • [X] No.
    • [ ] Yes.
  • Does this need documentation?

    • [X] No.
    • [ ] Yes.

Check List (For Reviewer who merge this PR)

  • [ ] Confirm the release note
  • [ ] Confirm test cases
  • [ ] Confirm document
  • [ ] Add branch pick label

Ryan19929 avatar Apr 17 '25 09:04 Ryan19929

Thank you for your contribution to Apache Doris. Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Thearas avatar Apr 17 '25 09:04 Thearas

run buildall

Ryan19929 avatar Apr 17 '25 09:04 Ryan19929

TPC-H: Total hot run time: 34949 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 5aaf69259ea171f55819df324a30a2fb01b51935, data reload: false

------ Round 1 ----------------------------------
q1	25874	5120	5051	5051
q2	2087	286	184	184
q3	10375	1269	710	710
q4	10248	1042	582	582
q5	7577	2431	2377	2377
q6	190	164	133	133
q7	933	744	623	623
q8	9323	1313	1161	1161
q9	6788	5130	5181	5130
q10	6839	2305	1901	1901
q11	494	289	268	268
q12	353	362	219	219
q13	17778	3699	3071	3071
q14	226	220	206	206
q15	533	499	492	492
q16	442	449	393	393
q17	611	891	388	388
q18	7369	7082	7024	7024
q19	1216	938	580	580
q20	351	356	227	227
q21	4489	3450	3273	3273
q22	1073	1006	956	956
Total cold run time: 115169 ms
Total hot run time: 34949 ms
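
The report does not label its columns. Judging from the numbers alone, each row appears to hold the query name, one cold run, two hot runs, and the faster of the two hot runs, with the totals summing the cold column and the best-hot column. The Python sketch below checks that reading against the Round 1 rows above. The column interpretation is an assumption inferred from the data, and `summarize` is a hypothetical helper, not part of the Doris tooling.

```python
REPORT = """\
q1 25874 5120 5051 5051
q2 2087 286 184 184
q3 10375 1269 710 710
q4 10248 1042 582 582
q5 7577 2431 2377 2377
q6 190 164 133 133
q7 933 744 623 623
q8 9323 1313 1161 1161
q9 6788 5130 5181 5130
q10 6839 2305 1901 1901
q11 494 289 268 268
q12 353 362 219 219
q13 17778 3699 3071 3071
q14 226 220 206 206
q15 533 499 492 492
q16 442 449 393 393
q17 611 891 388 388
q18 7369 7082 7024 7024
q19 1216 938 580 580
q20 351 356 227 227
q21 4489 3450 3273 3273
q22 1073 1006 956 956
"""


def summarize(report: str):
    """Return (total cold ms, total hot ms) under the assumed column layout."""
    total_cold = 0
    total_hot = 0
    for line in report.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip anything that is not a query row
        cold, hot1, hot2, best = (int(f) for f in fields[1:])
        # Assumed layout: the last column is the faster of the two hot runs.
        if best != min(hot1, hot2):
            raise ValueError(f"unexpected best-hot value in row {fields[0]}")
        total_cold += cold
        total_hot += best
    return total_cold, total_hot


print(summarize(REPORT))  # prints (115169, 34949)
```

Both values match the "Total cold run time" and "Total hot run time" lines of this round, which supports the assumed layout for the TPC-H tables.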

----- Round 2, with runtime_filter_mode=off -----
q1	5136	5105	5093	5093
q2	245	331	233	233
q3	2222	2660	2292	2292
q4	1465	1843	1483	1483
q5	4564	4448	4345	4345
q6	212	166	125	125
q7	1963	1865	1755	1755
q8	2605	2594	2608	2594
q9	7120	7096	7142	7096
q10	2990	3210	2729	2729
q11	568	513	509	509
q12	701	785	615	615
q13	3455	3807	3328	3328
q14	291	305	270	270
q15	508	491	498	491
q16	466	492	463	463
q17	1139	1640	1389	1389
q18	7710	7509	7475	7475
q19	858	898	1086	898
q20	1965	1983	1867	1867
q21	5400	4662	4689	4662
q22	1091	1043	988	988
Total cold run time: 52674 ms
Total hot run time: 50700 ms

doris-robot avatar Apr 17 '25 10:04 doris-robot

TPC-DS: Total hot run time: 185790 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5aaf69259ea171f55819df324a30a2fb01b51935, data reload: false

query1	998	461	486	461
query2	6554	1804	1792	1792
query3	6741	231	215	215
query4	26304	23644	23026	23026
query5	4306	642	462	462
query6	292	198	192	192
query7	4630	476	285	285
query8	286	238	222	222
query9	8594	2590	2589	2589
query10	467	342	269	269
query11	15222	15024	14792	14792
query12	154	121	116	116
query13	1672	541	410	410
query14	8845	6331	6331	6331
query15	210	194	175	175
query16	7361	670	518	518
query17	1211	734	585	585
query18	1991	412	315	315
query19	199	192	161	161
query20	121	118	122	118
query21	215	124	106	106
query22	4071	4115	4092	4092
query23	34056	33102	32979	32979
query24	8449	2425	2408	2408
query25	544	471	399	399
query26	1276	261	153	153
query27	2757	500	341	341
query28	4290	2113	2098	2098
query29	777	550	433	433
query30	284	218	188	188
query31	931	847	780	780
query32	72	64	64	64
query33	556	389	322	322
query34	813	849	523	523
query35	810	806	726	726
query36	982	980	922	922
query37	123	103	78	78
query38	4056	4169	4088	4088
query39	1443	1421	1389	1389
query40	215	119	110	110
query41	61	54	53	53
query42	119	102	111	102
query43	498	504	474	474
query44	1324	813	809	809
query45	184	174	167	167
query46	843	1028	636	636
query47	1749	1809	1727	1727
query48	379	411	302	302
query49	768	515	445	445
query50	662	683	409	409
query51	4138	4152	4089	4089
query52	104	105	102	102
query53	234	258	183	183
query54	587	582	512	512
query55	83	89	85	85
query56	337	294	291	291
query57	1115	1138	1066	1066
query58	262	291	255	255
query59	2547	2651	2546	2546
query60	326	331	300	300
query61	131	127	125	125
query62	808	717	678	678
query63	225	188	191	188
query64	4348	1020	713	713
query65	4386	4253	4258	4253
query66	1155	414	308	308
query67	15697	15377	15152	15152
query68	7836	889	524	524
query69	480	301	260	260
query70	1176	1144	1056	1056
query71	459	328	359	328
query72	5770	4677	4687	4677
query73	669	564	349	349
query74	8876	9085	8649	8649
query75	3872	3177	2745	2745
query76	3649	1185	777	777
query77	809	415	283	283
query78	9891	10164	9354	9354
query79	1999	815	563	563
query80	599	521	450	450
query81	468	262	224	224
query82	423	125	98	98
query83	264	247	233	233
query84	245	105	93	93
query85	789	352	328	328
query86	336	299	298	298
query87	4366	4513	4339	4339
query88	3631	2224	2243	2224
query89	387	322	279	279
query90	1918	213	219	213
query91	139	141	111	111
query92	76	60	59	59
query93	1414	938	585	585
query94	670	424	306	306
query95	377	284	278	278
query96	490	555	275	275
query97	3125	3213	3129	3129
query98	242	206	203	203
query99	1464	1410	1297	1297
Total cold run time: 272857 ms
Total hot run time: 185790 ms

doris-robot avatar Apr 17 '25 10:04 doris-robot

ClickBench: Total hot run time: 28.97 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5aaf69259ea171f55819df324a30a2fb01b51935, data reload: false

query1	0.04	0.03	0.04
query2	0.12	0.10	0.11
query3	0.24	0.19	0.19
query4	1.59	0.19	0.11
query5	0.57	0.56	0.56
query6	1.19	0.70	0.72
query7	0.03	0.02	0.02
query8	0.04	0.04	0.03
query9	0.59	0.52	0.50
query10	0.57	0.56	0.57
query11	0.15	0.11	0.11
query12	0.14	0.11	0.11
query13	0.62	0.59	0.61
query14	1.20	1.18	1.18
query15	0.87	0.84	0.86
query16	0.39	0.38	0.38
query17	1.01	1.05	1.02
query18	0.21	0.20	0.19
query19	1.92	1.78	1.80
query20	0.02	0.01	0.01
query21	15.41	0.91	0.54
query22	0.78	1.31	0.60
query23	14.82	1.39	0.63
query24	7.41	1.69	0.29
query25	0.28	0.15	0.28
query26	0.68	0.16	0.14
query27	0.06	0.05	0.05
query28	9.68	0.86	0.43
query29	12.54	4.05	3.40
query30	0.25	0.09	0.06
query31	2.83	0.59	0.38
query32	3.22	0.55	0.47
query33	3.04	3.05	3.03
query34	15.84	5.12	4.50
query35	4.53	4.50	4.50
query36	0.68	0.49	0.48
query37	0.08	0.06	0.07
query38	0.06	0.04	0.04
query39	0.03	0.02	0.02
query40	0.17	0.14	0.13
query41	0.08	0.02	0.03
query42	0.03	0.02	0.02
query43	0.04	0.03	0.02
Total cold run time: 104.05 s
Total hot run time: 28.97 s

doris-robot avatar Apr 17 '25 10:04 doris-robot

BE UT Coverage Report

Increment line coverage 92.78% (90/97) :tada:

Increment coverage report Complete coverage report

Category Coverage
Function Coverage 53.16% (14425/27136)
Line Coverage 42.04% (125053/297485)
Region Coverage 40.83% (63843/156356)
Branch Coverage 35.48% (32115/90508)

doris-robot avatar Apr 17 '25 11:04 doris-robot

run buildall

Ryan19929 avatar Apr 20 '25 09:04 Ryan19929

TPC-H: Total hot run time: 34240 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 6609880d8498cb44a7eea276830a0342a730fe14, data reload: false

------ Round 1 ----------------------------------
q1	25896	5066	4995	4995
q2	2065	332	201	201
q3	10315	1237	679	679
q4	10235	1002	533	533
q5	7510	2344	2348	2344
q6	176	162	134	134
q7	913	744	616	616
q8	9339	1308	1104	1104
q9	6927	5176	5146	5146
q10	6871	2307	1866	1866
q11	481	286	287	286
q12	357	356	222	222
q13	17782	3648	3068	3068
q14	222	220	210	210
q15	525	490	476	476
q16	442	444	408	408
q17	603	854	358	358
q18	7674	7227	7426	7227
q19	1472	970	590	590
q20	352	352	233	233
q21	4524	3426	2544	2544
q22	1073	1018	1000	1000
Total cold run time: 115754 ms
Total hot run time: 34240 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5179	5059	5085	5059
q2	247	333	236	236
q3	2170	2636	2254	2254
q4	1427	1825	1442	1442
q5	4483	4350	4442	4350
q6	223	171	127	127
q7	2002	1910	1746	1746
q8	2589	2588	2540	2540
q9	7214	7234	7103	7103
q10	2974	3164	2754	2754
q11	562	486	488	486
q12	654	756	615	615
q13	3556	3833	3306	3306
q14	280	304	300	300
q15	528	487	481	481
q16	460	504	470	470
q17	1245	1559	1373	1373
q18	7877	7626	7486	7486
q19	803	820	851	820
q20	1950	1938	1814	1814
q21	5346	4790	4867	4790
q22	1110	1086	1010	1010
Total cold run time: 52879 ms
Total hot run time: 50562 ms

doris-robot avatar Apr 20 '25 10:04 doris-robot

TPC-DS: Total hot run time: 192172 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6609880d8498cb44a7eea276830a0342a730fe14, data reload: false

query1	1373	1106	1033	1033
query2	6229	1887	1858	1858
query3	11168	4548	4398	4398
query4	53704	24931	23403	23403
query5	4958	638	438	438
query6	339	211	186	186
query7	4878	501	283	283
query8	297	236	229	229
query9	5276	2554	2559	2554
query10	450	321	261	261
query11	15088	15031	14854	14854
query12	157	105	99	99
query13	1020	495	377	377
query14	10093	6275	6369	6275
query15	196	213	183	183
query16	7062	668	540	540
query17	1091	751	585	585
query18	1550	416	327	327
query19	217	200	176	176
query20	138	125	128	125
query21	211	129	107	107
query22	4317	4465	4335	4335
query23	34030	33378	33627	33378
query24	6764	2376	2440	2376
query25	485	467	409	409
query26	734	279	165	165
query27	2331	508	340	340
query28	3026	2147	2155	2147
query29	598	572	502	502
query30	278	225	189	189
query31	874	877	773	773
query32	71	60	64	60
query33	449	360	322	322
query34	777	959	526	526
query35	828	853	798	798
query36	934	1013	923	923
query37	116	105	84	84
query38	4297	4333	4095	4095
query39	1515	1411	1419	1411
query40	209	116	113	113
query41	65	61	56	56
query42	132	103	100	100
query43	519	512	489	489
query44	1343	829	820	820
query45	177	172	171	171
query46	840	1026	627	627
query47	1844	1854	1781	1781
query48	380	411	310	310
query49	691	503	416	416
query50	652	695	411	411
query51	4220	4308	4270	4270
query52	110	111	99	99
query53	246	264	190	190
query54	586	578	503	503
query55	86	84	102	84
query56	322	311	287	287
query57	1173	1174	1141	1141
query58	260	279	275	275
query59	2813	2829	2913	2829
query60	321	316	309	309
query61	124	123	126	123
query62	746	745	680	680
query63	219	189	186	186
query64	1846	1054	677	677
query65	4388	4221	4246	4221
query66	756	394	298	298
query67	15828	15452	15502	15452
query68	7233	870	505	505
query69	522	300	257	257
query70	1171	1102	1115	1102
query71	490	318	290	290
query72	5740	4560	4777	4560
query73	1545	628	345	345
query74	8838	9208	8618	8618
query75	4157	3163	2692	2692
query76	4168	1192	742	742
query77	763	369	281	281
query78	10037	10239	9194	9194
query79	2467	807	563	563
query80	598	500	434	434
query81	484	258	224	224
query82	451	125	98	98
query83	246	241	231	231
query84	298	96	81	81
query85	767	361	311	311
query86	369	304	298	298
query87	4426	4445	4400	4400
query88	3746	2215	2283	2215
query89	405	312	288	288
query90	1818	213	222	213
query91	143	143	111	111
query92	77	58	57	57
query93	1924	963	584	584
query94	668	412	278	278
query95	369	295	288	288
query96	478	569	273	273
query97	3153	3188	3134	3134
query98	244	206	199	199
query99	1578	1379	1264	1264
Total cold run time: 298188 ms
Total hot run time: 192172 ms

doris-robot avatar Apr 20 '25 10:04 doris-robot

ClickBench: Total hot run time: 29.9 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 6609880d8498cb44a7eea276830a0342a730fe14, data reload: false

query1	0.04	0.04	0.03
query2	0.12	0.10	0.12
query3	0.25	0.20	0.20
query4	1.60	0.19	0.19
query5	0.60	0.58	0.59
query6	1.18	0.73	0.73
query7	0.02	0.01	0.02
query8	0.04	0.04	0.04
query9	0.58	0.53	0.53
query10	0.57	0.57	0.56
query11	0.15	0.10	0.11
query12	0.14	0.11	0.12
query13	0.60	0.60	0.59
query14	1.20	1.17	1.17
query15	0.87	0.86	0.86
query16	0.37	0.39	0.38
query17	1.02	1.06	1.00
query18	0.20	0.20	0.20
query19	1.92	1.76	1.80
query20	0.02	0.01	0.01
query21	15.39	0.89	0.55
query22	0.76	1.25	0.65
query23	14.88	1.36	0.65
query24	7.00	1.47	0.98
query25	0.45	0.17	0.09
query26	0.64	0.16	0.15
query27	0.06	0.05	0.05
query28	9.49	0.89	0.44
query29	12.53	4.08	3.39
query30	0.26	0.09	0.06
query31	2.82	0.59	0.38
query32	3.24	0.54	0.47
query33	2.96	3.01	3.12
query34	15.79	5.10	4.54
query35	4.56	4.54	4.55
query36	0.69	0.49	0.48
query37	0.08	0.06	0.06
query38	0.06	0.04	0.03
query39	0.03	0.02	0.03
query40	0.17	0.14	0.14
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 103.49 s
Total hot run time: 29.9 s

doris-robot avatar Apr 20 '25 10:04 doris-robot

BE UT Coverage Report

Increment line coverage 93.48% (86/92) :tada:

Increment coverage report Complete coverage report

Category Coverage
Function Coverage 53.17% (14431/27142)
Line Coverage 42.03% (125055/297504)
Region Coverage 40.86% (63881/156345)
Branch Coverage 35.49% (32120/90494)

hello-stephen avatar Apr 20 '25 11:04 hello-stephen

run buildall

Ryan19929 avatar Apr 21 '25 00:04 Ryan19929

TPC-H: Total hot run time: 33807 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 517dc2c79a3ce82477b173156496be613e7520b5, data reload: false

------ Round 1 ----------------------------------
q1	25731	5891	4993	4993
q2	2056	267	190	190
q3	10401	1225	668	668
q4	10228	992	527	527
q5	7526	2340	2322	2322
q6	184	160	130	130
q7	901	733	607	607
q8	9319	1254	1107	1107
q9	6920	5145	5079	5079
q10	6811	2295	1896	1896
q11	467	275	279	275
q12	347	361	219	219
q13	17767	3640	3059	3059
q14	220	216	212	212
q15	534	482	478	478
q16	435	441	393	393
q17	591	842	358	358
q18	7489	7207	7097	7097
q19	1227	949	547	547
q20	325	327	216	216
q21	4078	3559	2476	2476
q22	1068	1017	958	958
Total cold run time: 114625 ms
Total hot run time: 33807 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5037	5030	5028	5028
q2	240	330	238	238
q3	2145	2609	2330	2330
q4	1391	1776	1426	1426
q5	4477	4347	4329	4329
q6	218	169	127	127
q7	1954	1930	1718	1718
q8	2579	2614	2511	2511
q9	7227	7241	7136	7136
q10	2935	3136	2751	2751
q11	570	516	479	479
q12	679	784	627	627
q13	3512	3862	3347	3347
q14	286	287	266	266
q15	514	474	468	468
q16	473	503	462	462
q17	1153	1569	1359	1359
q18	7674	7647	7406	7406
q19	802	846	1016	846
q20	1926	1976	1888	1888
q21	5345	4762	4759	4759
q22	1056	1050	994	994
Total cold run time: 52193 ms
Total hot run time: 50495 ms

doris-robot avatar Apr 21 '25 01:04 doris-robot

TPC-DS: Total hot run time: 191884 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 517dc2c79a3ce82477b173156496be613e7520b5, data reload: false

query1	1413	1087	1074	1074
query2	6155	1844	1844	1844
query3	10997	4637	4327	4327
query4	52321	26189	23346	23346
query5	5007	519	456	456
query6	348	227	196	196
query7	4879	495	282	282
query8	316	249	244	244
query9	5494	2564	2536	2536
query10	419	307	252	252
query11	15071	15069	14784	14784
query12	149	104	106	104
query13	1032	494	377	377
query14	10091	6246	6261	6246
query15	186	193	182	182
query16	7152	662	498	498
query17	1059	720	568	568
query18	1603	402	324	324
query19	191	184	158	158
query20	136	127	114	114
query21	200	122	108	108
query22	4330	4424	4284	4284
query23	33935	33289	33284	33284
query24	6515	2393	2446	2393
query25	465	496	379	379
query26	690	274	158	158
query27	2289	507	337	337
query28	3018	2129	2125	2125
query29	599	584	452	452
query30	274	232	198	198
query31	863	880	811	811
query32	80	66	61	61
query33	467	368	316	316
query34	774	886	523	523
query35	803	843	751	751
query36	954	1024	915	915
query37	115	104	76	76
query38	4222	4225	4110	4110
query39	1498	1475	1445	1445
query40	217	133	102	102
query41	56	57	50	50
query42	117	108	110	108
query43	500	528	491	491
query44	1300	814	806	806
query45	181	175	167	167
query46	847	1024	645	645
query47	1880	1874	1777	1777
query48	391	409	314	314
query49	673	499	405	405
query50	660	689	412	412
query51	4217	4285	4224	4224
query52	106	106	99	99
query53	226	263	183	183
query54	577	588	514	514
query55	86	83	89	83
query56	307	308	294	294
query57	1138	1159	1109	1109
query58	268	252	260	252
query59	2728	2833	2780	2780
query60	338	333	307	307
query61	132	132	132	132
query62	742	762	699	699
query63	223	188	186	186
query64	1537	1122	789	789
query65	4382	4244	4249	4244
query66	797	411	319	319
query67	15873	15589	15192	15192
query68	7517	903	521	521
query69	537	305	274	274
query70	1158	1087	1102	1087
query71	495	313	285	285
query72	5953	4835	4992	4835
query73	1496	672	362	362
query74	8930	9108	8739	8739
query75	3855	3226	2721	2721
query76	4217	1192	782	782
query77	649	357	275	275
query78	9989	10075	9224	9224
query79	2284	817	555	555
query80	713	504	436	436
query81	484	258	217	217
query82	454	124	93	93
query83	252	250	227	227
query84	302	104	91	91
query85	764	359	313	313
query86	374	302	288	288
query87	4430	4469	4335	4335
query88	3648	2229	2210	2210
query89	403	311	283	283
query90	1799	206	206	206
query91	141	146	107	107
query92	75	58	58	58
query93	2020	948	580	580
query94	660	386	301	301
query95	377	304	338	304
query96	480	620	274	274
query97	3127	3270	3132	3132
query98	236	207	196	196
query99	1435	1436	1252	1252
Total cold run time: 295863 ms
Total hot run time: 191884 ms

doris-robot avatar Apr 21 '25 01:04 doris-robot

ClickBench: Total hot run time: 29.71 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 517dc2c79a3ce82477b173156496be613e7520b5, data reload: false

query1	0.04	0.03	0.03
query2	0.13	0.11	0.12
query3	0.26	0.19	0.19
query4	1.59	0.19	0.20
query5	0.58	0.60	0.58
query6	1.18	0.72	0.72
query7	0.03	0.02	0.01
query8	0.04	0.03	0.03
query9	0.57	0.53	0.52
query10	0.57	0.57	0.57
query11	0.15	0.11	0.10
query12	0.15	0.12	0.11
query13	0.61	0.60	0.60
query14	1.16	1.18	1.18
query15	0.88	0.85	0.85
query16	0.38	0.37	0.39
query17	1.03	1.05	1.05
query18	0.21	0.20	0.19
query19	1.87	1.80	1.80
query20	0.01	0.01	0.02
query21	15.42	0.93	0.56
query22	0.74	1.16	0.70
query23	14.95	1.36	0.61
query24	7.61	1.45	0.76
query25	0.46	0.14	0.19
query26	0.52	0.16	0.13
query27	0.06	0.05	0.05
query28	10.10	0.80	0.42
query29	12.57	4.11	3.41
query30	0.25	0.10	0.07
query31	2.82	0.58	0.39
query32	3.23	0.55	0.48
query33	3.06	3.00	3.07
query34	15.69	5.09	4.47
query35	4.54	4.49	4.52
query36	0.67	0.50	0.50
query37	0.08	0.07	0.06
query38	0.06	0.04	0.04
query39	0.04	0.02	0.02
query40	0.17	0.13	0.13
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.04	0.04	0.03
Total cold run time: 104.63 s
Total hot run time: 29.71 s

doris-robot avatar Apr 21 '25 01:04 doris-robot

BE UT Coverage Report

Increment line coverage 93.48% (86/92) :tada:

Increment coverage report Complete coverage report

Category Coverage
Function Coverage 53.17% (14431/27142)
Line Coverage 42.03% (125044/297504)
Region Coverage 40.85% (63873/156345)
Branch Coverage 35.48% (32111/90494)

hello-stephen avatar Apr 21 '25 01:04 hello-stephen

run buildall

Ryan19929 avatar Apr 21 '25 14:04 Ryan19929

TPC-H: Total hot run time: 33774 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 44223bbc93d68b84fbb8ec80e1926cac67caac32, data reload: false

------ Round 1 ----------------------------------
q1	25894	5011	4993	4993
q2	2074	279	178	178
q3	10407	1283	694	694
q4	10225	997	543	543
q5	7531	2262	2372	2262
q6	188	165	132	132
q7	873	752	619	619
q8	9327	1264	1062	1062
q9	6707	5144	5116	5116
q10	6813	2305	1901	1901
q11	472	283	269	269
q12	345	354	218	218
q13	17771	3671	3076	3076
q14	226	220	214	214
q15	530	488	496	488
q16	439	440	391	391
q17	593	861	373	373
q18	7563	7166	7135	7135
q19	1202	961	555	555
q20	327	336	225	225
q21	3923	2652	2386	2386
q22	1041	1005	944	944
Total cold run time: 114471 ms
Total hot run time: 33774 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5117	5098	5071	5071
q2	231	329	236	236
q3	2103	2623	2253	2253
q4	1396	1818	1402	1402
q5	4416	4464	4426	4426
q6	226	184	132	132
q7	1950	1904	1725	1725
q8	2570	2532	2479	2479
q9	7184	7048	7103	7048
q10	2969	3188	2759	2759
q11	564	483	494	483
q12	705	753	597	597
q13	3423	3840	3327	3327
q14	281	296	257	257
q15	519	482	478	478
q16	462	494	467	467
q17	1143	1563	1411	1411
q18	7497	7428	7489	7428
q19	802	811	872	811
q20	1981	2012	1880	1880
q21	5258	4714	4545	4545
q22	1051	1006	943	943
Total cold run time: 51848 ms
Total hot run time: 50158 ms

doris-robot avatar Apr 21 '25 15:04 doris-robot

TPC-DS: Total hot run time: 184837 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 44223bbc93d68b84fbb8ec80e1926cac67caac32, data reload: false

query1	1026	468	517	468
query2	6585	1819	1731	1731
query3	6737	221	213	213
query4	26097	23663	23109	23109
query5	4590	616	463	463
query6	290	226	179	179
query7	4635	485	282	282
query8	294	243	238	238
query9	8649	2546	2572	2546
query10	451	318	263	263
query11	15324	15055	14733	14733
query12	163	111	99	99
query13	1643	500	387	387
query14	9278	5968	5987	5968
query15	202	185	167	167
query16	7251	616	475	475
query17	1183	711	575	575
query18	1969	408	306	306
query19	194	182	153	153
query20	124	121	114	114
query21	217	134	109	109
query22	4064	4183	4176	4176
query23	33779	32966	32946	32946
query24	8449	2348	2369	2348
query25	568	482	398	398
query26	1219	279	157	157
query27	2743	498	335	335
query28	4387	2103	2096	2096
query29	812	552	414	414
query30	283	208	182	182
query31	931	848	761	761
query32	73	65	63	63
query33	560	347	315	315
query34	780	831	516	516
query35	784	805	739	739
query36	928	992	877	877
query37	108	105	75	75
query38	4197	4116	4074	4074
query39	1450	1438	1404	1404
query40	215	119	105	105
query41	57	51	54	51
query42	116	106	108	106
query43	469	503	447	447
query44	1305	793	786	786
query45	176	167	172	167
query46	822	1021	603	603
query47	1794	1826	1760	1760
query48	371	408	294	294
query49	777	514	432	432
query50	620	689	385	385
query51	4164	4094	4039	4039
query52	115	113	102	102
query53	230	247	180	180
query54	563	563	493	493
query55	86	81	81	81
query56	283	304	301	301
query57	1121	1145	1058	1058
query58	258	262	247	247
query59	2509	2700	2485	2485
query60	329	340	288	288
query61	131	144	121	121
query62	794	723	650	650
query63	224	182	185	182
query64	4302	1002	673	673
query65	4294	4193	4215	4193
query66	1126	420	331	331
query67	15823	15514	15257	15257
query68	8394	875	513	513
query69	450	316	256	256
query70	1194	1042	1052	1042
query71	442	311	296	296
query72	5776	4758	4832	4758
query73	709	638	346	346
query74	9019	8997	9058	8997
query75	3949	3192	2659	2659
query76	3691	1177	755	755
query77	790	378	281	281
query78	9981	10066	9214	9214
query79	2422	806	575	575
query80	632	496	438	438
query81	464	244	215	215
query82	455	128	97	97
query83	285	243	228	228
query84	295	100	86	86
query85	783	347	336	336
query86	336	312	293	293
query87	4490	4395	4324	4324
query88	2994	2195	2211	2195
query89	382	315	281	281
query90	1957	217	208	208
query91	140	138	108	108
query92	82	60	60	60
query93	1326	947	582	582
query94	673	407	294	294
query95	385	293	281	281
query96	488	556	276	276
query97	3206	3217	3118	3118
query98	232	205	208	205
query99	1444	1385	1249	1249
Total cold run time: 273957 ms
Total hot run time: 184837 ms

doris-robot avatar Apr 21 '25 15:04 doris-robot

ClickBench: Total hot run time: 29.12 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 44223bbc93d68b84fbb8ec80e1926cac67caac32, data reload: false

query1	0.04	0.04	0.02
query2	0.12	0.11	0.11
query3	0.26	0.20	0.19
query4	1.59	0.20	0.20
query5	0.60	0.59	0.61
query6	1.19	0.72	0.72
query7	0.02	0.02	0.02
query8	0.05	0.03	0.03
query9	0.57	0.52	0.51
query10	0.54	0.57	0.55
query11	0.16	0.10	0.10
query12	0.14	0.11	0.11
query13	0.61	0.60	0.59
query14	1.16	1.16	1.21
query15	0.89	0.85	0.85
query16	0.38	0.39	0.37
query17	1.05	0.99	1.02
query18	0.21	0.20	0.20
query19	1.89	1.82	1.82
query20	0.02	0.00	0.01
query21	15.41	0.88	0.53
query22	0.74	1.24	0.61
query23	15.00	1.37	0.60
query24	6.69	2.14	0.52
query25	0.51	0.06	0.07
query26	0.61	0.17	0.14
query27	0.06	0.04	0.04
query28	10.44	0.90	0.43
query29	12.55	3.95	3.35
query30	0.25	0.09	0.06
query31	2.83	0.58	0.38
query32	3.23	0.54	0.46
query33	2.95	3.06	3.05
query34	15.67	5.06	4.49
query35	4.50	4.52	4.49
query36	0.67	0.50	0.47
query37	0.09	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.03
query40	0.17	0.13	0.13
query41	0.08	0.03	0.02
query42	0.03	0.03	0.02
query43	0.04	0.03	0.03
Total cold run time: 104.08 s
Total hot run time: 29.12 s

doris-robot avatar Apr 21 '25 15:04 doris-robot

run buildall

Ryan19929 avatar Apr 28 '25 00:04 Ryan19929

TPC-H: Total hot run time: 33719 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 2d20046e945313b3d7d44454b224db073133b850, data reload: false

------ Round 1 ----------------------------------
q1	26069	5095	5018	5018
q2	2089	261	181	181
q3	10420	1233	694	694
q4	10208	1008	509	509
q5	7532	2343	2266	2266
q6	188	166	137	137
q7	900	735	611	611
q8	9321	1220	1061	1061
q9	7045	5026	5159	5026
q10	6846	2364	1934	1934
q11	508	298	285	285
q12	362	360	215	215
q13	17770	3688	3106	3106
q14	219	223	205	205
q15	537	487	475	475
q16	435	442	391	391
q17	595	854	367	367
q18	7485	7076	7133	7076
q19	1221	957	524	524
q20	335	334	231	231
q21	4067	3393	2457	2457
q22	1036	1019	950	950
Total cold run time: 115188 ms
Total hot run time: 33719 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5097	5041	5001	5001
q2	245	332	230	230
q3	2144	2616	2302	2302
q4	1353	1760	1376	1376
q5	4446	4372	4368	4368
q6	214	168	133	133
q7	2002	1931	1775	1775
q8	2582	2523	2507	2507
q9	7272	7208	6934	6934
q10	3022	3182	2732	2732
q11	572	500	486	486
q12	672	755	599	599
q13	3558	3913	3292	3292
q14	284	293	274	274
q15	543	488	488	488
q16	502	500	463	463
q17	1114	1496	1374	1374
q18	7692	7551	7405	7405
q19	793	815	1034	815
q20	1973	2110	1839	1839
q21	5143	4812	4662	4662
q22	1101	1028	1034	1028
Total cold run time: 52324 ms
Total hot run time: 50083 ms

doris-robot avatar Apr 28 '25 01:04 doris-robot

TPC-DS: Total hot run time: 192490 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2d20046e945313b3d7d44454b224db073133b850, data reload: false

query1	1425	1076	1055	1055
query2	6146	1807	1791	1791
query3	11012	4461	4451	4451
query4	57728	24496	23463	23463
query5	5156	457	462	457
query6	394	187	186	186
query7	5296	514	280	280
query8	334	244	221	221
query9	7204	2592	2616	2592
query10	442	340	263	263
query11	15300	14979	14934	14934
query12	155	112	102	102
query13	1286	512	397	397
query14	10030	6282	6357	6282
query15	200	198	179	179
query16	7137	638	493	493
query17	1071	703	566	566
query18	1611	402	302	302
query19	190	191	165	165
query20	134	136	119	119
query21	205	118	114	114
query22	4572	4548	4425	4425
query23	34267	33554	33517	33517
query24	6694	2452	2466	2452
query25	485	513	432	432
query26	696	288	164	164
query27	2225	502	344	344
query28	3440	2156	2141	2141
query29	616	577	454	454
query30	286	223	197	197
query31	877	866	794	794
query32	74	66	108	66
query33	487	381	304	304
query34	762	848	538	538
query35	809	826	753	753
query36	965	992	894	894
query37	112	102	74	74
query38	4167	4303	4139	4139
query39	1500	1454	1454	1454
query40	250	119	106	106
query41	54	53	52	52
query42	126	111	114	111
query43	524	518	511	511
query44	1330	830	815	815
query45	188	175	167	167
query46	849	1025	647	647
query47	1800	1867	1803	1803
query48	376	420	302	302
query49	685	527	465	465
query50	673	682	415	415
query51	4227	4234	4155	4155
query52	111	109	104	104
query53	231	265	183	183
query54	583	584	526	526
query55	85	81	81	81
query56	323	291	285	285
query57	1193	1213	1128	1128
query58	261	258	258	258
query59	2740	2735	2765	2735
query60	316	314	303	303
query61	134	126	128	126
query62	733	755	697	697
query63	221	191	216	191
query64	1748	1018	652	652
query65	4328	4204	4243	4204
query66	708	399	304	304
query67	16054	15505	15433	15433
query68	7389	872	515	515
query69	547	306	256	256
query70	1173	1093	1054	1054
query71	504	321	286	286
query72	5525	4709	4832	4709
query73	1171	617	348	348
query74	9222	8834	8671	8671
query75	3840	3229	2697	2697
query76	4290	1193	735	735
query77	612	370	277	277
query78	10050	10182	9295	9295
query79	2836	799	557	557
query80	808	486	433	433
query81	489	254	213	213
query82	500	127	94	94
query83	348	247	231	231
query84	295	105	90	90
query85	790	356	312	312
query86	413	309	268	268
query87	4318	4474	4251	4251
query88	3426	2173	2185	2173
query89	439	320	295	295
query90	1800	235	208	208
query91	135	148	108	108
query92	69	59	59	59
query93	2117	936	571	571
query94	643	402	293	293
query95	366	294	276	276
query96	486	569	268	268
query97	3192	3282	3149	3149
query98	221	212	197	197
query99	1449	1396	1288	1288
Total cold run time: 305804 ms
Total hot run time: 192490 ms

doris-robot avatar Apr 28 '25 01:04 doris-robot

ClickBench: Total hot run time: 29.47 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2d20046e945313b3d7d44454b224db073133b850, data reload: false

query1	0.03	0.03	0.04
query2	0.12	0.10	0.11
query3	0.25	0.19	0.19
query4	1.59	0.19	0.18
query5	0.59	0.58	0.59
query6	1.19	0.71	0.74
query7	0.02	0.02	0.02
query8	0.05	0.04	0.04
query9	0.57	0.51	0.52
query10	0.56	0.57	0.58
query11	0.15	0.11	0.11
query12	0.14	0.11	0.12
query13	0.61	0.59	0.60
query14	1.17	1.21	1.17
query15	0.86	0.85	0.84
query16	0.40	0.39	0.39
query17	1.05	1.01	1.01
query18	0.21	0.20	0.20
query19	1.95	1.80	1.81
query20	0.01	0.02	0.01
query21	15.40	0.91	0.56
query22	0.76	1.32	0.67
query23	14.76	1.36	0.65
query24	7.46	1.25	0.65
query25	0.49	0.14	0.08
query26	0.58	0.16	0.14
query27	0.05	0.05	0.05
query28	9.89	0.86	0.44
query29	12.61	4.03	3.36
query30	0.25	0.10	0.06
query31	2.82	0.57	0.39
query32	3.23	0.54	0.48
query33	2.95	3.07	3.03
query34	15.79	5.07	4.46
query35	4.50	4.52	4.49
query36	0.69	0.50	0.48
query37	0.09	0.07	0.06
query38	0.05	0.04	0.03
query39	0.03	0.02	0.03
query40	0.16	0.13	0.13
query41	0.08	0.03	0.03
query42	0.04	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 104.23 s
Total hot run time: 29.47 s

doris-robot avatar Apr 28 '25 01:04 doris-robot

BE UT Coverage Report

Increment line coverage :tada:

Increment coverage report Complete coverage report

Category Coverage
Function Coverage 54.39% (14746/27110)
Line Coverage 43.50% (129082/296737)
Region Coverage 42.20% (65893/156150)
Branch Coverage 36.73% (33199/90388)

doris-robot avatar Apr 28 '25 02:04 doris-robot

BE Regression P0 && UT Coverage Report

Increment line coverage 92.78% (90/97) :tada:

Increment coverage report Complete coverage report

Category Coverage
Function Coverage 55.56% (14785/26613)
Line Coverage 45.23% (134002/296252)
Region Coverage 42.16% (76971/182551)
Branch Coverage 36.18% (37239/102938)

hello-stephen avatar Apr 28 '25 03:04 hello-stephen

run buildall

Ryan19929 avatar Apr 29 '25 10:04 Ryan19929

BE UT Coverage Report

Increment line coverage 93.27% (97/104) :tada:

Increment coverage report Complete coverage report

Category Coverage
Function Coverage 54.82% (14778/26956)
Line Coverage 43.95% (129703/295127)
Region Coverage 42.65% (66187/155183)
Branch Coverage 37.26% (33413/89678)

doris-robot avatar Apr 29 '25 14:04 doris-robot

BE Regression P0 && UT Coverage Report

Increment line coverage 93.20% (96/103) :tada:

Increment coverage report Complete coverage report

Category Coverage
Function Coverage 57.76% (15282/26458)
Line Coverage 47.74% (140648/294640)
Region Coverage 44.77% (81305/181586)
Branch Coverage 38.68% (39541/102238)

hello-stephen avatar Apr 29 '25 18:04 hello-stephen