
[feature](iceberg catalog) support iceberg view query

Open heguanhui opened this issue 6 months ago • 16 comments

What problem does this PR solve?

This feature addresses the issue that Doris cannot query Iceberg views.

  1. Prerequisite dependencies: Iceberg version 1.7.x+
  2. Design approach:
  • Enhanced View Loading in IcebergMetadataCache:

Added view loading capability during loadTable operations for Iceberg.

Introduced getIcebergView() method to retrieve Iceberg views.

  • New Methods in IcebergExternalTable:

isView(): Determines if the current Iceberg table is a view.

getViewText(): Retrieves the SQL definition query of the view.

  • Execution Plan Generation Adjustment:

Checks isView() during plan generation to identify Iceberg views.

For views, generates a logical subquery execution plan by flattening the view into a subquery.

  • Cache Invalidation Enhancement:

Added view cache invalidation in IcebergMetadataCache when invalidating catalog/database/table caches.
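The plan-generation step described above can be sketched as follows. This is a minimal illustration, not the Doris code: `ViewBindingSketch`, `ExternalRelation`, and `bind` are hypothetical names, and the plan node is reduced to a descriptive string. It only shows the branch on the view check and the inlining of the view's SQL text as an aliased subquery.

```java
import java.util.Objects;

// Hypothetical, simplified stand-ins for illustration; not the actual Doris classes.
final class ViewBindingSketch {

    /** Minimal model of an external table entry that may actually be a view. */
    static final class ExternalRelation {
        final String name;
        final boolean view;
        final String viewText; // SQL definition; only meaningful when view == true

        ExternalRelation(String name, boolean view, String viewText) {
            this.name = name;
            this.view = view;
            this.viewText = viewText;
        }
    }

    /** Binds a relation and returns a textual description of the resulting plan node. */
    static String bind(ExternalRelation relation) {
        if (relation.view) {
            String sql = Objects.requireNonNull(relation.viewText, "view has no SQL text");
            // A view is inlined as an aliased subquery over its definition.
            return "SubQueryAlias(" + relation.name + ") <- (" + sql + ")";
        }
        // A regular table is scanned directly.
        return "Scan(" + relation.name + ")";
    }
}
```

In a real planner the view text would be parsed and analyzed into a logical plan rather than kept as a string; the string form here just keeps the sketch self-contained.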

Problem Summary:

Querying an Iceberg view in Doris currently fails with an error saying that the table does not exist, because Doris does not recognize Iceberg views. This PR makes Doris recognize Iceberg views so that they can be queried.

Check List (For Author)

Test (at least one of them must be included): Manual test scripts and report as below: 回归测试用例.docx (regression test cases)

Behavior changed:

Yes.

  • BindRelation.java

    (1) Modified getLogicalPlan method: Adjusted logic for Iceberg external tables to generate subquery execution plans when encountering views.

    (2) Renamed parseAndAnalyzeHiveView to parseAndAnalyzeExternalView: Unified handling for external views (Hive/Iceberg).

  • Config.java

    (1) Added enable_query_iceberg_views: Configuration flag to enable/disable Iceberg view query support.

  • IcebergExternalTable.java

    (1) Enhanced initSchema method: Added schema initialization compatibility for both Iceberg tables and views.

    (2) New getViewText method: Retrieves the SQL definition of an Iceberg view.

  • IcebergMetadataCache.java

    (1) Constructor Initialization: Added cache loading for Iceberg views.

    (2) Invalidation Methods: Extended invalidateCatalogCache, invalidateTableCache, invalidateDbCache to include view cache invalidation.

    New Methods:

    (1) loadView: Loads Iceberg view metadata.

    (2) getIcebergView: Retrieves cached Iceberg view information.
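The load-and-invalidate behavior listed for the view cache might look roughly like the sketch below. It assumes a plain `ConcurrentHashMap` keyed by catalog id, database name, and view name; `ViewCacheSketch`, `Key`, and the `String` value type are hypothetical, and the real cache may use a different caching library and value type.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch; names and structure are illustrative, not the Doris implementation.
final class ViewCacheSketch {

    /** Cache key: a view is identified by catalog id, database name, and view name. */
    static final class Key {
        final long catalogId;
        final String db;
        final String name;

        Key(long catalogId, String db, String name) {
            this.catalogId = catalogId;
            this.db = db;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return catalogId == other.catalogId && db.equals(other.db) && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(catalogId, db, name);
        }
    }

    private final Map<Key, String> views = new ConcurrentHashMap<>();
    private final Function<Key, String> loader;

    ViewCacheSketch(Function<Key, String> loader) {
        this.loader = loader;
    }

    /** Returns the cached view definition, loading it on first access. */
    String getView(long catalogId, String db, String name) {
        return views.computeIfAbsent(new Key(catalogId, db, name), loader);
    }

    /** Drops a single view entry. */
    void invalidateTable(long catalogId, String db, String name) {
        views.remove(new Key(catalogId, db, name));
    }

    /** Drops every view entry in one database of one catalog. */
    void invalidateDb(long catalogId, String db) {
        views.keySet().removeIf(k -> k.catalogId == catalogId && k.db.equals(db));
    }

    /** Drops every view entry belonging to one catalog. */
    void invalidateCatalog(long catalogId) {
        views.keySet().removeIf(k -> k.catalogId == catalogId);
    }

    int size() {
        return views.size();
    }
}
```

The three invalidation methods mirror the catalog, database, and table granularities named in the description, so a stale view definition is dropped whenever the enclosing scope is refreshed.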

  • IcebergMetadataOps.java

    (1) Modified listTableNames: Returns combined list of tables and views.

    New Methods:

    (1) viewExists: Checks if an Iceberg view exists.

    (2) loadView: Loads Iceberg view metadata.
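One way to return a combined list of tables and views is sketched below. The description does not say how the two lists are merged, so the order-preserving de-duplication is an assumption, and `ListNamesSketch` is a hypothetical helper rather than the actual `listTableNames` code.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not the Doris implementation.
final class ListNamesSketch {

    /** Merges table names and view names, keeping first-seen order and dropping duplicates. */
    static List<String> listTableNames(List<String> tables, List<String> views) {
        Set<String> merged = new LinkedHashSet<>(tables);
        merged.addAll(views);
        return new ArrayList<>(merged);
    }
}
```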

  • IcebergUtils.java

    Modified Methods:

    (1) getSchema: Supports schema retrieval for both tables and views.

    (2) loadSchemaCacheValue: Handles schema caching for both types.

    New Methods:

    (1) getIcebergView/getIcebergViewInternal: Retrieve Iceberg view objects.

    (2) getConvertedSchema: Converts Iceberg schema to Doris-compatible format.

  • ExternalMetadataOps.java

    New Interface Methods:

    (1) loadView: Load external view metadata.

    (2) viewExists: Check view existence.

    (3) listViewNames: List available views.

  • IcebergExternalCatalog.java

    (1) Modified listTableNames: Returns combined tables and views.

    (2) Overridden viewExists: Implements view existence check.

  • ExternalCatalog.java

    (1) New Interface Method: viewExists to check view existence.

  • ExternalTable.java

    New Methods:

    (1) isTable: Determines if the object is a table (vs. view).

    (2) getRemoteDbName: Retrieves the remote database name.

  • HMSExternalTable.java

    Modified Methods:

    (1) initSchema: Added view compatibility.

    (2) getIcebergSchema: Supports both tables and views.

    (3) Overridden isTable: Implements table type check.

  • IcebergApiSource.java

    (1) Constructor Adjustment: Added view type validation to prevent invalid operations.

  • IcebergTableSink.java

    (1) Constructor Adjustment: Added view type validation to prevent invalid operations.
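The view type validation added to the source and sink constructors could be as simple as the guard below. The exception type and message are assumptions, since the description does not state them, and `ViewGuardSketch.requireTable` is a hypothetical helper.

```java
// Hypothetical guard for illustration; the exception type and message are assumptions.
final class ViewGuardSketch {

    /** Returns the name unchanged for a real table and rejects views. */
    static String requireTable(String name, boolean isView, String operation) {
        if (isView) {
            throw new UnsupportedOperationException(operation + " is not supported on view: " + name);
        }
        return name;
    }
}
```

Failing fast in the constructor keeps a view from reaching code paths that expect physical data files, such as writes through a table sink.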

Does this need documentation? No.

heguanhui avatar May 30 '25 04:05 heguanhui

Thank you for your contribution to Apache Doris. Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Thearas avatar May 30 '25 04:05 Thearas

Please also add the necessary unit tests and regression tests for this feature.

morningman avatar Jun 01 '25 15:06 morningman

run buildall

morningman avatar Jun 17 '25 10:06 morningman

Cloud UT Coverage Report

Increment line coverage :tada:

Increment coverage report Complete coverage report

Category Coverage
Function Coverage 83.33% (1120/1344)
Line Coverage 66.82% (19318/28909)
Region Coverage 66.53% (9574/14390)
Branch Coverage 56.53% (5210/9216)

hello-stephen avatar Jun 17 '25 11:06 hello-stephen

run buildall

morningman avatar Jun 17 '25 12:06 morningman

Which TABLE_TYPE will an Iceberg view return when querying information.tables?

heguanhui avatar Jun 18 '25 02:06 heguanhui

run buildall

morningman avatar Jun 18 '25 02:06 morningman

Which TABLE_TYPE will an Iceberg view return when querying information.tables?

It will return iceberg, the same as for an Iceberg table. Likewise, a Hive view returns hive, the same as a Hive table.

I will rethink this logic, since there is currently no standard.

morningman avatar Jun 18 '25 02:06 morningman

TPC-H: Total hot run time: 34056 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ff0969c50c761c36cd31b4690d39acd0d0e997b5, data reload: false

------ Round 1 ----------------------------------
q1	17639	5220	4989	4989
q2	1920	295	200	200
q3	10293	1306	738	738
q4	10234	1000	521	521
q5	7537	2344	2384	2344
q6	179	180	131	131
q7	904	751	606	606
q8	9324	1525	1128	1128
q9	6812	5116	5142	5116
q10	6897	2385	1958	1958
q11	499	280	271	271
q12	350	356	216	216
q13	17756	3652	3078	3078
q14	241	240	225	225
q15	542	480	470	470
q16	428	432	368	368
q17	591	851	371	371
q18	7579	7242	7127	7127
q19	1218	942	558	558
q20	337	337	224	224
q21	3760	3145	2443	2443
q22	1101	1030	974	974
Total cold run time: 106141 ms
Total hot run time: 34056 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5462	5035	5041	5035
q2	249	324	226	226
q3	2150	2667	2289	2289
q4	1333	1798	1349	1349
q5	4259	4092	4447	4092
q6	209	167	130	130
q7	2022	1910	1780	1780
q8	2628	2533	2548	2533
q9	7143	7188	7083	7083
q10	3034	3293	2888	2888
q11	570	507	489	489
q12	686	760	625	625
q13	3446	3856	3325	3325
q14	295	294	264	264
q15	527	479	480	479
q16	435	477	449	449
q17	1177	1481	1421	1421
q18	7766	7610	7384	7384
q19	812	825	884	825
q20	1974	1980	1888	1888
q21	4989	4491	4335	4335
q22	1092	1061	994	994
Total cold run time: 52258 ms
Total hot run time: 49883 ms

doris-robot avatar Jun 18 '25 02:06 doris-robot

TPC-DS: Total hot run time: 185762 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ff0969c50c761c36cd31b4690d39acd0d0e997b5, data reload: false

query1	999	394	382	382
query2	6532	1832	1866	1832
query3	6745	219	232	219
query4	26341	23481	23509	23481
query5	4353	628	467	467
query6	317	226	199	199
query7	4637	486	288	288
query8	267	234	223	223
query9	8647	2637	2648	2637
query10	491	358	279	279
query11	15802	15287	14868	14868
query12	161	113	107	107
query13	1664	520	412	412
query14	10063	6123	6102	6102
query15	200	190	176	176
query16	7204	654	496	496
query17	1206	726	592	592
query18	1996	436	310	310
query19	200	190	171	171
query20	128	122	130	122
query21	220	133	108	108
query22	4062	4135	3990	3990
query23	34115	33102	33023	33023
query24	7747	2383	2365	2365
query25	551	459	393	393
query26	1244	267	151	151
query27	2645	478	345	345
query28	4379	2138	2114	2114
query29	762	576	425	425
query30	280	222	192	192
query31	946	870	762	762
query32	73	65	64	64
query33	568	372	317	317
query34	805	829	529	529
query35	778	822	745	745
query36	940	1021	883	883
query37	111	96	83	83
query38	4076	4079	4036	4036
query39	1499	1429	1416	1416
query40	214	123	110	110
query41	64	60	61	60
query42	134	108	108	108
query43	503	504	480	480
query44	1319	812	825	812
query45	182	167	168	167
query46	847	1016	620	620
query47	1782	1804	1706	1706
query48	398	412	330	330
query49	747	490	395	395
query50	628	674	405	405
query51	4101	4142	4197	4142
query52	111	109	96	96
query53	218	248	182	182
query54	576	569	505	505
query55	82	86	81	81
query56	307	295	278	278
query57	1193	1200	1130	1130
query58	278	266	249	249
query59	2697	2736	2641	2641
query60	322	317	302	302
query61	126	156	126	126
query62	797	717	685	685
query63	229	191	191	191
query64	4388	1036	667	667
query65	4250	4175	4229	4175
query66	1161	406	320	320
query67	15662	15446	15218	15218
query68	8254	877	511	511
query69	472	298	271	271
query70	1189	1117	1082	1082
query71	485	336	308	308
query72	5698	4725	4647	4647
query73	696	583	353	353
query74	9209	9071	8721	8721
query75	3972	3191	2711	2711
query76	3725	1200	764	764
query77	869	369	291	291
query78	10034	10053	9329	9329
query79	2664	784	575	575
query80	627	537	441	441
query81	501	265	234	234
query82	684	131	101	101
query83	294	244	236	236
query84	293	113	93	93
query85	781	352	364	352
query86	395	311	309	309
query87	4393	4446	4450	4446
query88	3678	2279	2272	2272
query89	393	310	287	287
query90	1846	210	206	206
query91	148	147	115	115
query92	75	66	58	58
query93	1753	935	572	572
query94	669	414	304	304
query95	373	297	284	284
query96	509	560	284	284
query97	2719	2812	2608	2608
query98	244	205	206	205
query99	1460	1421	1307	1307
Total cold run time: 276172 ms
Total hot run time: 185762 ms

doris-robot avatar Jun 18 '25 03:06 doris-robot

ClickBench: Total hot run time: 29.74 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ff0969c50c761c36cd31b4690d39acd0d0e997b5, data reload: false

query1	0.04	0.04	0.03
query2	0.07	0.04	0.04
query3	0.23	0.07	0.07
query4	1.62	0.10	0.10
query5	0.45	0.42	0.42
query6	1.20	0.64	0.67
query7	0.03	0.01	0.01
query8	0.05	0.04	0.04
query9	0.56	0.52	0.51
query10	0.57	0.58	0.56
query11	0.15	0.11	0.11
query12	0.15	0.12	0.12
query13	0.63	0.61	0.62
query14	0.80	0.82	0.85
query15	0.89	0.88	0.86
query16	0.40	0.37	0.38
query17	1.10	1.04	1.04
query18	0.23	0.22	0.21
query19	1.93	1.85	1.86
query20	0.02	0.01	0.01
query21	15.40	0.90	0.56
query22	0.77	1.22	0.66
query23	14.90	1.37	0.63
query24	7.40	1.43	0.91
query25	0.50	0.21	0.08
query26	0.61	0.16	0.14
query27	0.07	0.05	0.05
query28	9.69	0.87	0.45
query29	12.56	4.02	3.34
query30	0.26	0.09	0.06
query31	2.83	0.60	0.40
query32	3.26	0.56	0.48
query33	3.15	3.12	3.09
query34	16.00	5.38	4.81
query35	4.85	4.83	4.90
query36	0.68	0.50	0.49
query37	0.08	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.02
query40	0.18	0.15	0.15
query41	0.09	0.02	0.02
query42	0.03	0.03	0.02
query43	0.04	0.04	0.03
Total cold run time: 104.55 s
Total hot run time: 29.74 s

doris-robot avatar Jun 18 '25 03:06 doris-robot

FE UT Coverage Report

Increment line coverage 4.29% (9/210) :tada: Increment coverage report Complete coverage report

hello-stephen avatar Jun 18 '25 04:06 hello-stephen

run buildall

morningman avatar Jun 18 '25 10:06 morningman

TPC-H: Total hot run time: 33918 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 79097b1953e6a9f086a82d25d8683f1abf48adfe, data reload: false

------ Round 1 ----------------------------------
q1	17583	5231	4997	4997
q2	1929	279	187	187
q3	10380	1284	728	728
q4	10287	1005	523	523
q5	8399	2450	2339	2339
q6	187	162	132	132
q7	922	751	632	632
q8	9346	1334	1064	1064
q9	6808	5037	5079	5037
q10	6887	2384	1955	1955
q11	494	297	277	277
q12	343	353	212	212
q13	17775	3708	3134	3134
q14	231	227	212	212
q15	564	474	479	474
q16	424	440	378	378
q17	583	860	350	350
q18	7689	7128	7128	7128
q19	1650	942	563	563
q20	338	340	216	216
q21	3872	3191	2433	2433
q22	1051	1017	947	947
Total cold run time: 107742 ms
Total hot run time: 33918 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5086	5030	5071	5030
q2	242	325	223	223
q3	2197	2705	2312	2312
q4	1346	1872	1355	1355
q5	4261	4129	4491	4129
q6	219	171	129	129
q7	2030	1934	1785	1785
q8	2654	2634	2503	2503
q9	7146	7099	7154	7099
q10	3110	3334	2809	2809
q11	588	517	502	502
q12	690	773	638	638
q13	3494	3910	3281	3281
q14	287	297	268	268
q15	509	483	471	471
q16	455	487	431	431
q17	1143	1577	1345	1345
q18	7828	7452	7314	7314
q19	794	794	978	794
q20	1981	2132	1892	1892
q21	5445	4520	4183	4183
q22	1018	1001	973	973
Total cold run time: 52523 ms
Total hot run time: 49466 ms

doris-robot avatar Jun 18 '25 11:06 doris-robot

TPC-DS: Total hot run time: 187047 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 79097b1953e6a9f086a82d25d8683f1abf48adfe, data reload: false

query1	1006	401	393	393
query2	6542	1929	1931	1929
query3	6750	229	221	221
query4	26337	24092	23710	23710
query5	4343	643	497	497
query6	306	214	203	203
query7	4634	495	305	305
query8	270	237	220	220
query9	8613	2650	2660	2650
query10	483	338	273	273
query11	15882	15152	14874	14874
query12	171	111	110	110
query13	1658	538	409	409
query14	9068	6223	6019	6019
query15	208	198	186	186
query16	7164	608	461	461
query17	1192	732	567	567
query18	1981	402	314	314
query19	192	190	182	182
query20	129	122	114	114
query21	214	121	117	117
query22	4079	4083	4043	4043
query23	34013	33095	33124	33095
query24	8431	2371	2434	2371
query25	564	490	402	402
query26	1220	266	153	153
query27	2764	510	352	352
query28	4318	2139	2122	2122
query29	780	560	455	455
query30	288	223	194	194
query31	925	855	746	746
query32	74	63	93	63
query33	546	365	315	315
query34	808	856	559	559
query35	787	810	767	767
query36	973	998	886	886
query37	111	99	82	82
query38	4093	4252	4000	4000
query39	1475	1415	1412	1412
query40	215	118	107	107
query41	63	61	59	59
query42	132	111	123	111
query43	510	528	491	491
query44	1362	822	824	822
query45	205	173	171	171
query46	850	1039	633	633
query47	1755	1755	1717	1717
query48	386	440	318	318
query49	768	480	401	401
query50	682	689	407	407
query51	4172	4264	4011	4011
query52	114	108	97	97
query53	226	255	180	180
query54	577	574	508	508
query55	88	80	83	80
query56	300	305	299	299
query57	1183	1204	1130	1130
query58	277	258	266	258
query59	2641	2727	2733	2727
query60	338	332	317	317
query61	130	124	126	124
query62	800	729	648	648
query63	238	197	197	197
query64	4458	1102	765	765
query65	4284	4216	4216	4216
query66	1171	410	311	311
query67	15849	15663	15470	15470
query68	7766	885	532	532
query69	472	318	265	265
query70	1151	1133	1136	1133
query71	429	335	293	293
query72	5511	4783	4792	4783
query73	649	633	351	351
query74	9270	9174	8785	8785
query75	3508	3213	2736	2736
query76	3372	1212	785	785
query77	751	420	313	313
query78	10064	10098	9365	9365
query79	2292	858	595	595
query80	599	519	454	454
query81	501	273	227	227
query82	449	131	101	101
query83	258	255	240	240
query84	248	112	91	91
query85	798	348	318	318
query86	383	306	279	279
query87	4485	4577	4430	4430
query88	3981	2280	2286	2280
query89	387	353	287	287
query90	1853	210	213	210
query91	141	146	116	116
query92	79	66	61	61
query93	1875	958	594	594
query94	695	416	314	314
query95	375	299	293	293
query96	501	575	283	283
query97	2744	2766	2686	2686
query98	238	217	205	205
query99	1469	1383	1296	1296
Total cold run time: 274460 ms
Total hot run time: 187047 ms

doris-robot avatar Jun 18 '25 11:06 doris-robot

FE UT Coverage Report

Increment line coverage 4.57% (9/197) :tada: Increment coverage report Complete coverage report

hello-stephen avatar Jun 18 '25 12:06 hello-stephen

run buildall

morningman avatar Jun 20 '25 05:06 morningman

TPC-H: Total hot run time: 34185 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f105dffb4303ac09a24b860f39a5338d1bf55837, data reload: false

------ Round 1 ----------------------------------
q1	17624	5195	5096	5096
q2	1938	308	197	197
q3	10257	1303	731	731
q4	10217	1012	533	533
q5	7497	2363	2369	2363
q6	185	160	137	137
q7	899	770	601	601
q8	9315	1296	1164	1164
q9	6625	5070	5115	5070
q10	6917	2376	1964	1964
q11	496	290	278	278
q12	351	344	219	219
q13	17785	3669	3112	3112
q14	242	234	218	218
q15	553	476	482	476
q16	439	425	380	380
q17	624	905	386	386
q18	7735	7177	7012	7012
q19	1615	959	591	591
q20	345	338	234	234
q21	4052	3358	2452	2452
q22	1032	1039	971	971
Total cold run time: 106743 ms
Total hot run time: 34185 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5236	5126	5178	5126
q2	249	327	223	223
q3	2258	2700	2359	2359
q4	1353	1821	1376	1376
q5	4204	4161	4390	4161
q6	256	179	137	137
q7	1975	1943	1758	1758
q8	2637	2592	2538	2538
q9	7063	6976	7035	6976
q10	3092	3268	2816	2816
q11	581	504	498	498
q12	702	758	613	613
q13	3949	3928	3320	3320
q14	278	304	281	281
q15	519	475	465	465
q16	435	501	433	433
q17	1147	1504	1373	1373
q18	7284	7028	6942	6942
q19	786	792	943	792
q20	1973	1928	1802	1802
q21	4739	4335	4281	4281
q22	1078	1065	998	998
Total cold run time: 51794 ms
Total hot run time: 49268 ms

doris-robot avatar Jun 20 '25 06:06 doris-robot

TPC-DS: Total hot run time: 185875 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f105dffb4303ac09a24b860f39a5338d1bf55837, data reload: false

query1	979	404	391	391
query2	6511	1884	1873	1873
query3	6744	231	224	224
query4	26463	23718	23031	23031
query5	4383	652	488	488
query6	306	215	211	211
query7	4629	487	305	305
query8	268	232	214	214
query9	8623	2623	2640	2623
query10	479	328	284	284
query11	15754	15190	14973	14973
query12	181	116	107	107
query13	1656	549	438	438
query14	9781	6134	6294	6134
query15	205	198	173	173
query16	7643	632	448	448
query17	1195	686	569	569
query18	2005	404	300	300
query19	190	187	157	157
query20	117	113	115	113
query21	216	133	109	109
query22	3983	4255	4053	4053
query23	34025	32986	33217	32986
query24	8392	2368	2413	2368
query25	535	469	395	395
query26	1231	278	154	154
query27	2742	515	374	374
query28	4316	2113	2085	2085
query29	738	541	438	438
query30	283	220	197	197
query31	928	829	754	754
query32	69	69	61	61
query33	561	391	339	339
query34	806	849	543	543
query35	802	822	764	764
query36	959	1004	862	862
query37	114	107	83	83
query38	4072	4086	4040	4040
query39	1489	1423	1390	1390
query40	217	121	118	118
query41	64	61	63	61
query42	132	111	113	111
query43	496	519	482	482
query44	1360	847	829	829
query45	182	172	169	169
query46	864	1040	627	627
query47	1728	1755	1675	1675
query48	385	425	317	317
query49	732	485	410	410
query50	662	693	423	423
query51	4108	4156	4141	4141
query52	116	111	108	108
query53	231	261	191	191
query54	582	585	527	527
query55	90	92	91	91
query56	325	307	321	307
query57	1211	1176	1117	1117
query58	260	259	257	257
query59	2686	2713	2583	2583
query60	333	332	317	317
query61	130	123	121	121
query62	817	709	683	683
query63	234	196	187	187
query64	4243	1026	681	681
query65	4225	4177	4179	4177
query66	1084	410	335	335
query67	15730	15447	15331	15331
query68	6550	899	539	539
query69	491	305	280	280
query70	1197	1108	1111	1108
query71	419	337	314	314
query72	5153	4655	4564	4564
query73	622	573	357	357
query74	9331	9094	8977	8977
query75	3180	3194	2733	2733
query76	3275	1192	780	780
query77	483	383	293	293
query78	10023	10176	9256	9256
query79	1700	784	599	599
query80	675	502	461	461
query81	510	265	220	220
query82	184	128	97	97
query83	257	251	237	237
query84	254	111	91	91
query85	754	363	360	360
query86	356	331	298	298
query87	4364	4506	4344	4344
query88	2910	2280	2306	2280
query89	399	313	280	280
query90	1738	210	209	209
query91	145	144	115	115
query92	68	62	58	58
query93	1040	945	599	599
query94	624	416	319	319
query95	381	288	295	288
query96	492	568	284	284
query97	2704	2760	2645	2645
query98	233	204	210	204
query99	1323	1399	1259	1259
Total cold run time: 269435 ms
Total hot run time: 185875 ms

doris-robot avatar Jun 20 '25 06:06 doris-robot

PR approved by at least one committer and no changes requested.

github-actions[bot] avatar Jun 20 '25 13:06 github-actions[bot]

PR approved by anyone and no changes requested.

github-actions[bot] avatar Jun 20 '25 13:06 github-actions[bot]