[fix](hive) support find column separator for hive text format table with 'serialization.format'

Open morningman opened this issue 1 year ago • 7 comments

The column separator of a Hive text-format table may be set by either the "field.delim" or the "serialization.format" parameter. We need to support both.
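
The lookup described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the helper name `resolve_column_separator`, the lookup order ("field.delim" first, then "serialization.format"), and the fallback to Ctrl-A (`\x01`) are assumptions. It also treats the stored value as the literal separator, which is a simplification.

```python
from typing import Mapping

# Assumed fallback: Ctrl-A, the conventional default field delimiter for Hive text tables.
DEFAULT_FIELD_DELIM = "\x01"


def resolve_column_separator(serde_params: Mapping[str, str]) -> str:
    """Return the column separator for a Hive text-format table.

    Hypothetical helper: checks "field.delim" first, then
    "serialization.format", and falls back to the assumed default.
    """
    for key in ("field.delim", "serialization.format"):
        value = serde_params.get(key)
        if value:  # skip missing or empty values
            return value
    return DEFAULT_FIELD_DELIM


# Each property on its own is enough to determine the separator.
print(repr(resolve_column_separator({"field.delim": ","})))           # ','
print(repr(resolve_column_separator({"serialization.format": "|"})))  # '|'
print(repr(resolve_column_separator({})))                             # '\x01'
```

Checking the keys in a fixed order keeps the behavior predictable when a table defines both properties with different values.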

morningman • Jun 30 '24 15:06

run buildall

morningman • Jun 30 '24 15:06

Thank you for your contribution to Apache Doris. Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website. See Doris Document.

doris-robot • Jun 30 '24 15:06

TPC-H: Total hot run time: 39654 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit efb6fb97a5370ea7cdf91bb469c807eeaddb19d8, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17628	4323	4269	4269
q2	2014	186	194	186
q3	10468	1183	1085	1085
q4	10177	821	766	766
q5	7480	2648	2601	2601
q6	217	141	139	139
q7	959	588	598	588
q8	9237	2081	2033	2033
q9	8753	6466	6475	6466
q10	8958	3701	3722	3701
q11	465	230	232	230
q12	432	234	230	230
q13	17865	2971	3012	2971
q14	258	235	232	232
q15	531	486	485	485
q16	509	383	370	370
q17	954	672	715	672
q18	7983	7338	7401	7338
q19	5799	1453	1434	1434
q20	680	345	324	324
q21	4895	3256	3202	3202
q22	390	337	332	332
Total cold run time: 116652 ms
Total hot run time: 39654 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4344	4225	4260	4225
q2	372	263	256	256
q3	2952	2764	2865	2764
q4	1991	1737	1686	1686
q5	5579	5544	5481	5481
q6	223	133	131	131
q7	2178	1803	1861	1803
q8	3274	3379	3384	3379
q9	8718	8590	8798	8590
q10	4088	3921	3742	3742
q11	578	492	507	492
q12	799	655	655	655
q13	16094	3163	3199	3163
q14	309	264	285	264
q15	529	506	500	500
q16	473	431	435	431
q17	1810	1545	1516	1516
q18	7983	7920	7739	7739
q19	3840	1582	1582	1582
q20	2102	1864	1852	1852
q21	5161	4883	4980	4883
q22	599	516	554	516
Total cold run time: 73996 ms
Total hot run time: 55650 ms

doris-robot • Jun 30 '24 15:06

TPC-DS: Total hot run time: 173924 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit efb6fb97a5370ea7cdf91bb469c807eeaddb19d8, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	924	403	378	378
query2	6480	2520	2419	2419
query3	6635	201	213	201
query4	19509	17535	17360	17360
query5	3628	478	483	478
query6	276	167	188	167
query7	4602	297	298	297
query8	302	303	282	282
query9	8543	2387	2373	2373
query10	561	309	276	276
query11	10497	9948	9928	9928
query12	117	81	79	79
query13	1633	364	356	356
query14	10035	7860	7011	7011
query15	247	195	180	180
query16	7766	270	260	260
query17	1929	546	519	519
query18	1935	273	260	260
query19	200	148	147	147
query20	90	84	79	79
query21	204	133	125	125
query22	4335	4104	4032	4032
query23	34059	33738	33568	33568
query24	10938	2876	2861	2861
query25	603	383	381	381
query26	1109	150	160	150
query27	2810	321	321	321
query28	7630	2104	2113	2104
query29	908	664	628	628
query30	249	157	161	157
query31	1004	772	760	760
query32	96	56	57	56
query33	749	282	282	282
query34	1051	500	498	498
query35	742	629	637	629
query36	1138	984	996	984
query37	149	82	86	82
query38	2970	2891	2796	2796
query39	902	847	835	835
query40	207	128	128	128
query41	55	52	53	52
query42	109	105	107	105
query43	620	545	531	531
query44	1227	750	728	728
query45	195	161	169	161
query46	1078	729	736	729
query47	1829	1787	1750	1750
query48	364	303	288	288
query49	847	405	418	405
query50	774	392	386	386
query51	6828	6729	6834	6729
query52	108	93	93	93
query53	365	300	291	291
query54	921	437	441	437
query55	74	71	74	71
query56	280	254	262	254
query57	1127	1044	1055	1044
query58	250	261	256	256
query59	3362	3087	3050	3050
query60	314	289	295	289
query61	115	108	106	106
query62	613	444	431	431
query63	326	289	291	289
query64	8941	2342	1855	1855
query65	3179	3108	3110	3108
query66	779	332	335	332
query67	15183	14836	14700	14700
query68	6237	537	533	533
query69	648	433	356	356
query70	1211	1137	1156	1137
query71	492	286	289	286
query72	7776	5755	5827	5755
query73	821	329	325	325
query74	5881	5491	5534	5491
query75	4162	2644	2697	2644
query76	4351	1091	925	925
query77	675	305	302	302
query78	10618	9696	9879	9696
query79	8594	521	522	521
query80	1596	468	467	467
query81	553	225	217	217
query82	757	108	102	102
query83	200	168	163	163
query84	271	84	85	84
query85	1338	337	268	268
query86	462	314	286	286
query87	3297	3117	3069	3069
query88	5266	2387	2352	2352
query89	519	382	399	382
query90	1897	193	194	193
query91	127	98	101	98
query92	65	48	49	48
query93	6731	494	499	494
query94	1103	184	192	184
query95	398	316	312	312
query96	612	275	267	267
query97	3211	3018	3075	3018
query98	213	203	194	194
query99	1208	858	859	858
Total cold run time: 288915 ms
Total hot run time: 173924 ms

doris-robot • Jun 30 '24 16:06

ClickBench: Total hot run time: 30.72 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit efb6fb97a5370ea7cdf91bb469c807eeaddb19d8, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.04	0.04
query2	0.08	0.04	0.04
query3	0.23	0.05	0.06
query4	1.67	0.07	0.07
query5	0.49	0.48	0.49
query6	1.13	0.72	0.72
query7	0.02	0.02	0.01
query8	0.05	0.04	0.04
query9	0.54	0.49	0.49
query10	0.54	0.54	0.54
query11	0.15	0.12	0.12
query12	0.15	0.12	0.12
query13	0.59	0.60	0.59
query14	0.79	0.78	0.77
query15	0.85	0.83	0.82
query16	0.35	0.36	0.37
query17	0.97	1.00	1.03
query18	0.24	0.25	0.22
query19	1.91	1.73	1.74
query20	0.02	0.01	0.01
query21	15.43	0.74	0.66
query22	4.34	7.18	2.18
query23	18.32	1.27	1.23
query24	2.23	0.23	0.21
query25	0.16	0.08	0.07
query26	0.26	0.18	0.17
query27	0.08	0.08	0.08
query28	13.14	1.00	0.98
query29	12.62	3.28	3.23
query30	0.25	0.06	0.06
query31	2.86	0.40	0.38
query32	3.29	0.47	0.48
query33	2.89	2.92	2.89
query34	17.22	4.42	4.40
query35	4.44	4.45	4.46
query36	0.66	0.46	0.47
query37	0.18	0.15	0.15
query38	0.16	0.13	0.14
query39	0.04	0.03	0.03
query40	0.19	0.13	0.14
query41	0.10	0.05	0.05
query42	0.06	0.05	0.05
query43	0.04	0.04	0.04
Total cold run time: 109.77 s
Total hot run time: 30.72 s

doris-robot • Jun 30 '24 16:06

PR approved by at least one committer and no changes requested.

github-actions[bot] • Jul 01 '24 07:07

PR approved by anyone and no changes requested.

github-actions[bot] • Jul 01 '24 07:07

run buildall

morningman • Jul 02 '24 02:07

TPC-H: Total hot run time: 42652 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit ef09085428285b280aef45a98e0eb7b86abc5780, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17614	5485	4768	4768
q2	2012	199	186	186
q3	10467	1260	1251	1251
q4	10197	832	829	829
q5	7533	2868	2968	2868
q6	247	141	140	140
q7	1040	597	613	597
q8	9238	2286	2263	2263
q9	9150	6923	6919	6919
q10	9101	3899	3927	3899
q11	461	237	240	237
q12	481	238	236	236
q13	17756	2993	2956	2956
q14	282	228	217	217
q15	556	480	494	480
q16	556	375	371	371
q17	1031	686	755	686
q18	8185	7563	7435	7435
q19	6920	1683	1541	1541
q20	683	335	339	335
q21	5145	4075	4255	4075
q22	413	363	368	363
Total cold run time: 119068 ms
Total hot run time: 42652 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4792	4678	4894	4678
q2	427	278	281	278
q3	3357	2962	3135	2962
q4	2131	1695	1696	1695
q5	5615	5654	5755	5654
q6	253	135	129	129
q7	2330	1816	1843	1816
q8	3539	3744	3765	3744
q9	8907	8891	8875	8875
q10	4173	4043	3932	3932
q11	656	505	496	496
q12	843	645	614	614
q13	15934	3150	3169	3150
q14	332	267	276	267
q15	562	491	470	470
q16	494	413	433	413
q17	1979	1636	1574	1574
q18	8333	7936	8004	7936
q19	2022	1742	1905	1742
q20	2631	1888	1859	1859
q21	5508	5069	5315	5069
q22	796	579	550	550
Total cold run time: 75614 ms
Total hot run time: 57903 ms

doris-robot • Jul 02 '24 04:07

TPC-DS: Total hot run time: 173888 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ef09085428285b280aef45a98e0eb7b86abc5780, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	939	378	390	378
query2	6392	2321	2452	2321
query3	6629	214	217	214
query4	18556	17579	17280	17280
query5	3692	488	493	488
query6	254	179	164	164
query7	4597	308	301	301
query8	348	297	306	297
query9	8506	2378	2346	2346
query10	587	302	284	284
query11	10527	10057	10013	10013
query12	117	93	87	87
query13	1648	376	371	371
query14	9933	7889	6960	6960
query15	229	191	198	191
query16	7837	275	282	275
query17	1814	554	531	531
query18	1991	284	277	277
query19	209	159	160	159
query20	99	91	82	82
query21	221	137	132	132
query22	4277	4028	4011	4011
query23	33958	33738	33817	33738
query24	11061	2859	2903	2859
query25	621	412	408	408
query26	730	162	161	161
query27	2306	338	344	338
query28	6068	2137	2163	2137
query29	910	672	657	657
query30	244	191	155	155
query31	986	780	739	739
query32	98	56	58	56
query33	784	320	303	303
query34	1012	494	499	494
query35	726	666	649	649
query36	1138	931	973	931
query37	151	85	86	85
query38	3020	2833	2879	2833
query39	922	845	913	845
query40	212	136	130	130
query41	53	52	57	52
query42	110	98	99	98
query43	625	574	556	556
query44	1197	723	735	723
query45	204	165	159	159
query46	1087	728	713	713
query47	1798	1770	1747	1747
query48	381	302	293	293
query49	861	414	409	409
query50	758	389	388	388
query51	6775	6865	6793	6793
query52	98	93	97	93
query53	363	281	293	281
query54	898	451	452	451
query55	77	73	77	73
query56	279	261	270	261
query57	1128	1062	1044	1044
query58	237	241	250	241
query59	3358	3281	3307	3281
query60	312	277	283	277
query61	93	91	89	89
query62	601	451	454	451
query63	320	291	295	291
query64	8866	2246	1740	1740
query65	3189	3094	3092	3092
query66	744	325	324	324
query67	15750	15034	15174	15034
query68	8081	544	555	544
query69	716	476	333	333
query70	1129	1152	1072	1072
query71	512	284	270	270
query72	8692	4855	5165	4855
query73	834	333	326	326
query74	5866	5504	5558	5504
query75	4885	2640	2678	2640
query76	4604	1014	1000	1000
query77	761	304	300	300
query78	10514	9812	9812	9812
query79	8073	520	521	520
query80	1015	476	476	476
query81	541	218	216	216
query82	738	113	112	112
query83	315	177	169	169
query84	261	85	86	85
query85	1280	272	267	267
query86	426	294	314	294
query87	3284	3103	3076	3076
query88	4918	2404	2367	2367
query89	518	390	390	390
query90	1901	196	190	190
query91	133	98	100	98
query92	68	51	50	50
query93	6064	504	495	495
query94	1152	194	187	187
query95	410	317	319	317
query96	604	275	263	263
query97	3184	3012	3044	3012
query98	207	200	196	196
query99	1139	865	818	818
Total cold run time: 287336 ms
Total hot run time: 173888 ms

doris-robot • Jul 02 '24 04:07

ClickBench: Total hot run time: 30.8 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ef09085428285b280aef45a98e0eb7b86abc5780, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.03	0.03
query2	0.08	0.05	0.04
query3	0.22	0.04	0.04
query4	1.68	0.07	0.08
query5	0.50	0.50	0.47
query6	1.13	0.73	0.74
query7	0.02	0.01	0.01
query8	0.05	0.04	0.04
query9	0.56	0.49	0.49
query10	0.54	0.54	0.54
query11	0.15	0.11	0.12
query12	0.14	0.12	0.12
query13	0.59	0.60	0.59
query14	0.81	0.79	0.77
query15	0.85	0.82	0.82
query16	0.37	0.36	0.37
query17	0.99	1.00	0.98
query18	0.24	0.24	0.26
query19	1.81	1.68	1.75
query20	0.02	0.01	0.01
query21	15.43	0.77	0.66
query22	4.16	6.63	2.10
query23	18.23	1.40	1.30
query24	2.14	0.22	0.23
query25	0.16	0.08	0.09
query26	0.28	0.17	0.17
query27	0.09	0.08	0.08
query28	13.25	1.02	1.00
query29	12.64	3.32	3.33
query30	0.26	0.06	0.05
query31	2.85	0.40	0.39
query32	3.26	0.49	0.47
query33	2.87	2.96	2.91
query34	17.24	4.37	4.48
query35	4.53	4.47	4.46
query36	0.66	0.48	0.47
query37	0.18	0.16	0.17
query38	0.16	0.15	0.15
query39	0.04	0.04	0.04
query40	0.17	0.14	0.15
query41	0.09	0.04	0.04
query42	0.06	0.04	0.05
query43	0.04	0.04	0.04
Total cold run time: 109.58 s
Total hot run time: 30.8 s

doris-robot • Jul 02 '24 04:07

PR approved by at least one committer and no changes requested.

github-actions[bot] • Jul 02 '24 15:07