
[enhancement](profile) Store profiles on disk so that we can hold more profiles in memory

Open zhiqiang-hhhh opened this issue 10 months ago • 29 comments

Step 3 of https://github.com/apache/doris/issues/33744

Store profiles on the FE's disk, in the same path as the audit log.

  1. ProfileManager will have a daemon thread that checks whether a profile should be stored on disk.
  2. Which profiles can be stored? a. The query has finished. b. Collection of the profile has finished, or the query itself finished a while ago (e.g. 5 seconds).
  3. Profile structure on disk:
     * Integer: n (size of the summary profile)
     * String: JSON of the summary profile
     * Integer: m (size of the execution profile)
     * String: raw text of the execution profile
  4. The IO thread of ProfileManager will also remove garbage from disk.
  5. Once a profile is stored on disk, its detailed content is released from memory so that we can hold more profiles in memory; further access to the profile reads from disk directly.
  6. Basic but necessary UT for profile serialization.
  7. Refine the constructor of ExecutionProfile.
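
The on-disk structure listed above can be illustrated with a minimal sketch. This is not the code from this PR: the class name `ProfileLayoutSketch` and its methods are hypothetical, and it assumes that each "size" is the byte length of the UTF-8 encoded string and that integers are written as 4-byte big-endian values (the `DataOutputStream` default). The actual encoding used by ProfileManager may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the layout; not an actual Doris class.
public class ProfileLayoutSketch {

    // Serializes [int n][summary JSON][int m][execution profile text].
    static byte[] write(String summaryJson, String executionText) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            writeSection(out, summaryJson);
            writeSection(out, executionText);
        }
        return buf.toByteArray();
    }

    // Reads the two sections back in the same order they were written.
    static String[] read(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String summary = readSection(in);
            String execution = readSection(in);
            return new String[] {summary, execution};
        }
    }

    // Assumption: the size prefix is the UTF-8 byte length of the text.
    private static void writeSection(DataOutputStream out, String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    private static String readSection(DataInputStream in) throws IOException {
        int size = in.readInt();
        byte[] bytes = new byte[size];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] blob = write("{\"Profile ID\":\"example\"}", "Fragment 0:\n  (details)");
        String[] parts = read(blob);
        System.out.println(parts[0]);
        System.out.println(parts[1]);
    }
}
```

Because the summary is length-prefixed and comes first, a reader could load only the summary section and skip the larger execution profile until it is actually requested.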

Further work:

  1. Abstract classes ProfileReader/ProfileWriter that define a common interface for profile IO, so that we can implement ProfileDiskReader/ProfileDiskWriter as well as ProfileS3Reader/ProfileS3Writer to support profile IO on object storage.
  2. Finer granularity for profile creation. Currently a profile is also created for DDL such as create/drop table, which is unnecessary.
  3. More regression tests.
  4. A more reasonable default value for max_profile_on_disk.
  5. A system table for profile storage, so that we can see how much storage profiles consume.
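
As a rough sketch of the ProfileReader/ProfileWriter idea in item 1, the abstraction could look like the following. The class names come from the list above, but the method signatures, the one-file-per-profile layout and the wrapper class `ProfileIoSketch` are assumptions made for illustration only; none of this exists in the PR.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the proposed abstraction; signatures are assumed.
public class ProfileIoSketch {

    // Common interface for persisting a serialized profile.
    abstract static class ProfileWriter {
        abstract void write(String profileId, String content) throws IOException;
    }

    // Common interface for loading a serialized profile.
    abstract static class ProfileReader {
        abstract String read(String profileId) throws IOException;
    }

    // Local-disk implementation: one file per profile under a directory.
    static final class ProfileDiskWriter extends ProfileWriter {
        private final Path dir;

        ProfileDiskWriter(Path dir) {
            this.dir = dir;
        }

        @Override
        void write(String profileId, String content) throws IOException {
            Files.createDirectories(dir);
            Files.write(dir.resolve(profileId), content.getBytes(StandardCharsets.UTF_8));
        }
    }

    static final class ProfileDiskReader extends ProfileReader {
        private final Path dir;

        ProfileDiskReader(Path dir) {
            this.dir = dir;
        }

        @Override
        String read(String profileId) throws IOException {
            return new String(Files.readAllBytes(dir.resolve(profileId)), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("profile-sketch");
        ProfileWriter writer = new ProfileDiskWriter(dir);
        ProfileReader reader = new ProfileDiskReader(dir);
        writer.write("example_query", "summary and details");
        System.out.println(reader.read("example_query"));
    }
}
```

An object-storage variant such as ProfileS3Reader/ProfileS3Writer would extend the same two base classes, so callers would not need to know where a profile is stored.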

zhiqiang-hhhh avatar Apr 16 '24 02:04 zhiqiang-hhhh

Thank you for your contribution to Apache Doris. Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website. See Doris Document.

doris-robot avatar Apr 16 '24 02:04 doris-robot

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Apr 17 '24 06:04 github-actions[bot]

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Apr 18 '24 08:04 github-actions[bot]

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Apr 19 '24 04:04 github-actions[bot]

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Apr 19 '24 06:04 github-actions[bot]

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Apr 19 '24 10:04 github-actions[bot]

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Apr 19 '24 12:04 github-actions[bot]

run buildall

zhiqiang-hhhh avatar Apr 21 '24 12:04 zhiqiang-hhhh

TeamCity be ut coverage result:
Function Coverage: 35.38% (8918/25204)
Line Coverage: 27.10% (73307/270517)
Region Coverage: 26.24% (37882/144355)
Branch Coverage: 23.05% (19290/83682)
Coverage Report: http://coverage.selectdb-in.cc/coverage/b6871f379edeea67efe32aed98fe3c64b5fb0aa0_b6871f379edeea67efe32aed98fe3c64b5fb0aa0/report/index.html

doris-robot avatar Apr 21 '24 12:04 doris-robot

run buildall

zhiqiang-hhhh avatar Apr 23 '24 13:04 zhiqiang-hhhh

run buildall

zhiqiang-hhhh avatar Apr 23 '24 14:04 zhiqiang-hhhh

ClickBench: Total hot run time: 31.5 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1d7637fbda410de137ba7f0e77cefeb7f6503843, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.04	0.04
query3	0.23	0.05	0.05
query4	1.67	0.07	0.07
query5	0.50	0.53	0.50
query6	1.37	0.89	0.82
query7	0.03	0.02	0.01
query8	0.06	0.04	0.05
query9	0.49	0.44	0.45
query10	0.50	0.51	0.51
query11	0.15	0.11	0.11
query12	0.14	0.12	0.12
query13	0.63	0.64	0.62
query14	0.91	1.06	0.93
query15	0.87	0.85	0.86
query16	0.37	0.38	0.38
query17	1.05	1.03	0.99
query18	0.21	0.26	0.20
query19	1.95	1.88	1.81
query20	0.01	0.01	0.02
query21	15.46	0.65	0.65
query22	4.31	7.99	1.75
query23	18.30	1.42	1.28
query24	1.60	0.34	0.25
query25	0.15	0.09	0.09
query26	0.27	0.18	0.17
query27	0.09	0.09	0.09
query28	13.40	1.02	1.01
query29	12.67	3.44	3.45
query30	0.26	0.08	0.06
query31	2.86	0.41	0.41
query32	3.22	0.50	0.50
query33	2.75	2.98	2.97
query34	17.25	4.69	4.57
query35	4.57	4.56	4.66
query36	0.65	0.47	0.46
query37	0.21	0.17	0.18
query38	0.20	0.19	0.20
query39	0.05	0.04	0.05
query40	0.19	0.16	0.14
query41	0.11	0.06	0.06
query42	0.07	0.06	0.07
query43	0.06	0.05	0.05
Total cold run time: 109.96 s
Total hot run time: 31.5 s

doris-robot avatar Apr 23 '24 14:04 doris-robot

run buildall

zhiqiang-hhhh avatar Apr 24 '24 02:04 zhiqiang-hhhh

run buildall

zhiqiang-hhhh avatar Apr 24 '24 02:04 zhiqiang-hhhh

run buildall

zhiqiang-hhhh avatar Apr 24 '24 03:04 zhiqiang-hhhh

run buildall

zhiqiang-hhhh avatar Apr 24 '24 11:04 zhiqiang-hhhh

run buildall

zhiqiang-hhhh avatar Apr 28 '24 08:04 zhiqiang-hhhh

TPC-H: Total hot run time: 40282 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ea9ed22e00ad432632abdfed7fad9efa1232b930, data reload: false

------ Round 1 ----------------------------------
q1	17613	4523	4377	4377
q2	2586	191	200	191
q3	12113	1215	1250	1215
q4	10265	742	822	742
q5	8709	2791	2687	2687
q6	220	133	137	133
q7	1026	636	626	626
q8	9395	2123	2075	2075
q9	9271	6630	6573	6573
q10	8638	3706	3768	3706
q11	454	237	244	237
q12	393	220	224	220
q13	17770	2964	2971	2964
q14	277	241	238	238
q15	505	479	465	465
q16	510	379	391	379
q17	994	637	704	637
q18	7957	7423	7476	7423
q19	1587	1526	1506	1506
q20	665	307	298	298
q21	5010	3325	4044	3325
q22	333	278	265	265
Total cold run time: 116291 ms
Total hot run time: 40282 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4280	4192	4215	4192
q2	371	269	282	269
q3	3024	2739	2729	2729
q4	1860	1607	1617	1607
q5	5285	5332	5292	5292
q6	210	128	125	125
q7	2246	1903	1898	1898
q8	3206	3326	3360	3326
q9	8615	8496	8535	8496
q10	3906	3697	3684	3684
q11	581	482	491	482
q12	763	573	576	573
q13	16333	2936	2976	2936
q14	301	292	260	260
q15	515	483	476	476
q16	486	420	428	420
q17	1777	1478	1481	1478
q18	7514	7493	7477	7477
q19	4567	1498	1548	1498
q20	1984	1754	1759	1754
q21	4956	4933	4886	4886
q22	579	476	506	476
Total cold run time: 73359 ms
Total hot run time: 54334 ms

doris-robot avatar Apr 28 '24 09:04 doris-robot

TPC-DS: Total hot run time: 187016 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ea9ed22e00ad432632abdfed7fad9efa1232b930, data reload: false

query1	904	352	355	352
query2	6459	2566	2417	2417
query3	6650	219	214	214
query4	22971	21225	21118	21118
query5	4163	415	418	415
query6	269	174	171	171
query7	4583	289	289	289
query8	240	192	181	181
query9	8697	2395	2389	2389
query10	451	242	262	242
query11	14639	14078	14149	14078
query12	138	92	87	87
query13	1637	369	390	369
query14	10410	7181	8357	7181
query15	247	173	170	170
query16	8131	261	267	261
query17	1845	609	541	541
query18	2082	277	277	277
query19	195	149	144	144
query20	93	85	83	83
query21	203	125	130	125
query22	5115	4991	4875	4875
query23	33931	33521	33223	33223
query24	10847	2997	2918	2918
query25	630	382	414	382
query26	1220	153	149	149
query27	3007	323	320	320
query28	7349	2062	2024	2024
query29	848	592	609	592
query30	274	150	153	150
query31	942	720	728	720
query32	92	52	56	52
query33	758	251	237	237
query34	1082	470	481	470
query35	791	696	678	678
query36	1095	897	924	897
query37	139	65	69	65
query38	3138	2994	3035	2994
query39	1579	1511	1553	1511
query40	202	123	122	122
query41	42	36	36	36
query42	101	93	105	93
query43	569	526	573	526
query44	1249	720	744	720
query45	273	258	253	253
query46	1074	707	695	695
query47	1949	1850	1849	1849
query48	377	295	308	295
query49	1070	396	404	396
query50	768	398	397	397
query51	6805	6694	6616	6616
query52	101	92	88	88
query53	346	280	276	276
query54	303	233	236	233
query55	83	72	76	72
query56	237	219	220	219
query57	1192	1127	1131	1127
query58	226	200	202	200
query59	3460	3240	3420	3240
query60	269	242	256	242
query61	90	89	88	88
query62	674	441	433	433
query63	304	276	276	276
query64	8575	7142	7195	7142
query65	3108	3003	3065	3003
query66	1423	330	329	329
query67	15368	15041	15520	15041
query68	5240	543	570	543
query69	491	303	309	303
query70	1176	1081	1143	1081
query71	393	279	272	272
query72	7921	2640	2410	2410
query73	703	331	341	331
query74	6492	6073	6101	6073
query75	3387	2722	2658	2658
query76	2957	956	1071	956
query77	414	266	269	266
query78	10796	10600	10351	10351
query79	3377	525	518	518
query80	1831	434	435	434
query81	555	230	217	217
query82	756	95	94	94
query83	278	235	171	171
query84	271	89	85	85
query85	2034	267	259	259
query86	520	302	312	302
query87	3316	3119	3131	3119
query88	4656	2433	2424	2424
query89	482	378	374	374
query90	2005	185	184	184
query91	125	96	98	96
query92	56	49	46	46
query93	4935	518	506	506
query94	1240	180	179	179
query95	1094	1098	1093	1093
query96	606	273	271	271
query97	3164	2950	2994	2950
query98	226	222	216	216
query99	1249	868	866	866
Total cold run time: 290828 ms
Total hot run time: 187016 ms

doris-robot avatar Apr 28 '24 09:04 doris-robot

run buildall

zhiqiang-hhhh avatar Apr 30 '24 02:04 zhiqiang-hhhh

run buildall

zhiqiang-hhhh avatar Apr 30 '24 03:04 zhiqiang-hhhh

TPC-DS: Total hot run time: 187926 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b6c1e62ea1b189e1da2fee22e68dc9f94a1e8a38, data reload: false

query1	917	365	351	351
query2	6281	2444	2354	2354
query3	6681	215	216	215
query4	22897	21830	22011	21830
query5	3819	464	421	421
query6	261	188	184	184
query7	4541	293	306	293
query8	239	189	182	182
query9	8595	2500	2514	2500
query10	432	273	260	260
query11	15125	14789	14916	14789
query12	121	90	89	89
query13	1652	365	375	365
query14	8817	7733	7398	7398
query15	253	166	168	166
query16	8153	275	261	261
query17	1731	557	542	542
query18	2111	273	269	269
query19	319	147	147	147
query20	87	82	87	82
query21	192	124	140	124
query22	5129	4902	4897	4897
query23	34053	33385	33369	33369
query24	10635	2862	3000	2862
query25	590	372	373	372
query26	1162	157	147	147
query27	2336	346	311	311
query28	7176	2091	2076	2076
query29	860	589	596	589
query30	245	148	150	148
query31	944	755	705	705
query32	99	52	53	52
query33	729	242	250	242
query34	1034	484	488	484
query35	784	666	674	666
query36	1022	908	882	882
query37	132	69	65	65
query38	3189	3038	3121	3038
query39	1575	1539	1533	1533
query40	198	125	124	124
query41	41	38	38	38
query42	104	101	106	101
query43	578	551	528	528
query44	1161	723	728	723
query45	262	257	262	257
query46	1076	731	735	731
query47	1938	1865	1856	1856
query48	380	296	302	296
query49	896	390	391	390
query50	790	383	388	383
query51	6864	6664	6584	6584
query52	99	91	89	89
query53	355	281	281	281
query54	310	240	234	234
query55	81	74	77	74
query56	259	232	236	232
query57	1202	1145	1123	1123
query58	231	205	211	205
query59	3354	3155	3114	3114
query60	267	248	247	247
query61	109	149	89	89
query62	632	451	446	446
query63	305	280	283	280
query64	8635	7250	7196	7196
query65	3115	3052	3032	3032
query66	814	343	334	334
query67	15718	15229	14930	14930
query68	5182	552	546	546
query69	498	300	305	300
query70	1213	1084	1120	1084
query71	411	278	277	277
query72	7977	2546	2382	2382
query73	730	326	326	326
query74	6461	6187	6253	6187
query75	3516	2652	2662	2652
query76	3302	1003	981	981
query77	464	276	273	273
query78	10893	10426	10277	10277
query79	8200	535	504	504
query80	2032	429	434	429
query81	545	225	218	218
query82	1555	98	89	89
query83	307	166	210	166
query84	268	83	85	83
query85	1947	260	276	260
query86	481	308	297	297
query87	3263	3109	3070	3070
query88	5338	2412	2398	2398
query89	472	378	381	378
query90	2000	186	183	183
query91	126	99	99	99
query92	57	48	46	46
query93	6165	510	502	502
query94	1160	187	183	183
query95	390	294	297	294
query96	609	268	264	264
query97	3124	2976	2949	2949
query98	242	219	218	218
query99	1227	894	834	834
Total cold run time: 294847 ms
Total hot run time: 187926 ms

doris-robot avatar Apr 30 '24 04:04 doris-robot

run buildall

zhiqiang-hhhh avatar May 02 '24 02:05 zhiqiang-hhhh

run buildall

zhiqiang-hhhh avatar May 02 '24 02:05 zhiqiang-hhhh

run buildall

zhiqiang-hhhh avatar May 02 '24 02:05 zhiqiang-hhhh

run buildall

zhiqiang-hhhh avatar May 02 '24 03:05 zhiqiang-hhhh

run buildall

zhiqiang-hhhh avatar May 02 '24 15:05 zhiqiang-hhhh

TPC-H: Total hot run time: 40043 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 70ef7c9707b85ffb3a03d3b361cd2ba4d9a4f184, data reload: false

------ Round 1 ----------------------------------
q1	17812	4508	4322	4322
q2	2697	192	197	192
q3	11415	1142	1211	1142
q4	10432	841	843	841
q5	7923	2689	2712	2689
q6	212	129	127	127
q7	1021	582	553	553
q8	9215	2083	2041	2041
q9	9065	6557	6529	6529
q10	8952	3754	3721	3721
q11	489	242	241	241
q12	461	224	231	224
q13	18298	2967	2976	2967
q14	261	217	209	209
q15	508	480	479	479
q16	512	378	380	378
q17	970	623	694	623
q18	8089	7444	7366	7366
q19	1898	1522	1498	1498
q20	648	299	312	299
q21	5109	3953	3322	3322
q22	347	280	285	280
Total cold run time: 116334 ms
Total hot run time: 40043 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4344	4219	4167	4167
q2	379	256	273	256
q3	2970	2748	2773	2748
q4	1889	1607	1611	1607
q5	5237	5276	5251	5251
q6	221	129	129	129
q7	2238	1908	1869	1869
q8	3203	3331	3348	3331
q9	8428	8386	8418	8386
q10	3949	3737	3645	3645
q11	576	488	486	486
q12	754	562	573	562
q13	16300	2917	2972	2917
q14	289	270	258	258
q15	520	487	471	471
q16	472	408	435	408
q17	1777	1468	1451	1451
q18	7544	7493	7406	7406
q19	1695	1549	1550	1549
q20	1979	1768	1778	1768
q21	4914	5030	4786	4786
q22	577	489	540	489
Total cold run time: 70255 ms
Total hot run time: 53940 ms

doris-robot avatar May 02 '24 16:05 doris-robot

TPC-DS: Total hot run time: 184893 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 70ef7c9707b85ffb3a03d3b361cd2ba4d9a4f184, data reload: false

query1	910	362	346	346
query2	6627	2375	2284	2284
query3	6700	210	213	210
query4	23675	21187	21239	21187
query5	4129	422	420	420
query6	272	187	177	177
query7	4588	304	292	292
query8	247	194	196	194
query9	8562	2350	2301	2301
query10	438	247	248	247
query11	14674	14060	14045	14045
query12	136	92	90	90
query13	1638	371	387	371
query14	9678	6752	8188	6752
query15	223	175	170	170
query16	8018	257	264	257
query17	1860	568	534	534
query18	2055	279	270	270
query19	208	153	149	149
query20	97	88	84	84
query21	199	129	124	124
query22	4980	4776	4817	4776
query23	33832	33124	33129	33124
query24	12082	2978	2897	2897
query25	657	365	357	357
query26	1800	156	150	150
query27	3122	314	316	314
query28	7386	2017	1982	1982
query29	1037	590	593	590
query30	306	148	149	148
query31	975	745	727	727
query32	88	51	51	51
query33	743	248	244	244
query34	1063	478	489	478
query35	869	643	674	643
query36	1093	914	904	904
query37	285	67	67	67
query38	3130	3000	2973	2973
query39	1569	1533	1543	1533
query40	266	125	130	125
query41	40	39	39	39
query42	105	98	97	97
query43	562	536	552	536
query44	1255	744	748	744
query45	271	258	253	253
query46	1086	745	712	712
query47	1938	1862	1838	1838
query48	388	300	309	300
query49	1211	406	429	406
query50	772	395	390	390
query51	6767	6657	6694	6657
query52	98	91	92	91
query53	369	292	290	290
query54	327	244	258	244
query55	80	74	72	72
query56	252	231	233	231
query57	1232	1141	1139	1139
query58	241	206	226	206
query59	3438	3216	2954	2954
query60	277	243	245	243
query61	111	105	105	105
query62	696	458	439	439
query63	316	288	283	283
query64	9744	7363	7359	7359
query65	3126	3037	3053	3037
query66	1397	355	338	338
query67	15383	14990	14894	14894
query68	9388	552	546	546
query69	537	309	318	309
query70	1252	1099	1141	1099
query71	512	284	283	283
query72	8097	2501	2353	2353
query73	1477	318	324	318
query74	6445	6229	6001	6001
query75	4336	2609	2663	2609
query76	5576	966	1028	966
query77	667	261	273	261
query78	11080	10131	10222	10131
query79	12268	516	510	510
query80	2795	442	440	440
query81	505	232	217	217
query82	235	91	99	91
query83	224	169	168	168
query84	268	86	82	82
query85	1176	267	266	266
query86	350	298	315	298
query87	3325	3072	3088	3072
query88	5637	2427	2428	2427
query89	528	381	371	371
query90	2414	183	179	179
query91	129	94	98	94
query92	60	48	46	46
query93	7257	524	500	500
query94	1606	192	182	182
query95	393	307	303	303
query96	628	266	269	266
query97	3161	2948	2924	2924
query98	245	218	215	215
query99	1224	882	895	882
Total cold run time: 315494 ms
Total hot run time: 184893 ms

doris-robot avatar May 02 '24 16:05 doris-robot

run buildall

zhiqiang-hhhh avatar May 14 '24 06:05 zhiqiang-hhhh