iceberg
Creating an Iceberg table throws an exception when Hive Ranger is enabled
Apache Iceberg version
0.14.0 (latest release)
Query engine
Hive
Please describe the bug 🐞
Using the iceberg-hive-runtime 0.14 jar on Hive 3.x from Cloudera, run this command:
CREATE EXTERNAL TABLE TAB1 (
C1 int,
C2 int
)
STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
TBLPROPERTIES (
'external.table.purge' = 'true',
'transactional' = 'false'
)
The following exception is thrown:
2022-09-14T13:15:15,949 WARN com.ibm.biginsights.catalog.translator.hive.HiveExceptionTranslator [Thread-21] : [BSL-0-33c23b26d]: Exception: FAILED: Execution Error, return code 40000 from org.apache.hadoop.hive.ql.ddl.DDLTask. org.apache.iceberg.exceptions.CommitStateUnknownException: MetaException(message:Permission denied: user [db2iuser] does not have [RWSTORAGE] privilege on [null://null])
Cannot determine whether the commit was successful or not, the underlying data files may or may not be needed. Manual intervention via the Remove Orphan Files Action can remove these files when a connection to the Catalog can be re-established if the commit was actually unsuccessful.
Please check to see whether or not your commit was successful before retrying this commit. Retrying an already successful operation will result in duplicate records or unintentional modifications.
At this time no files will be deleted including possibly unused manifest lists.
org.apache.hadoop.hive.ql.processors.CommandProcessorException: null
This looks very similar to an issue reported against Hive 4, which has Iceberg integrated: https://issues.apache.org/jira/browse/HIVE-25964
Looking at the Ranger access log, the Hive SQL policies for the database and table were triggered and allowed. So the problem might be related to how HiveIcebergStorageHandler provides the URI used for the RWSTORAGE authorization.
@yikuanlee could you attach the full stack trace of the error please?
2022-09-14T13:15:15,949 WARN com.ibm.biginsights.catalog.translator.hive.HiveExceptionTranslator [Thread-21] : [BSL-0-33c23b26d]: Exception: FAILED: Execution Error, return code 40000 from org.apache.hadoop.hive.ql.ddl.DDLTask. org.apache.iceberg.exceptions.CommitStateUnknownException: MetaException(message:Permission denied: user [db2iuser] does not have [RWSTORAGE] privilege on [null://null])
Cannot determine whether the commit was successful or not, the underlying data files may or may not be needed. Manual intervention via the Remove Orphan Files Action can remove these files when a connection to the Catalog can be re-established if the commit was actually unsuccessful.
Please check to see whether or not your commit was successful before retrying this commit. Retrying an already successful operation will result in duplicate records or unintentional modifications.
At this time no files will be deleted including possibly unused manifest lists.
org.apache.hadoop.hive.ql.processors.CommandProcessorException: null
at org.apache.hadoop.hive.ql.DriverUtils.createProcessorException(DriverUtils.java:140) ~[hive-exec-3.1.3000.7.1.7.0-8-IBM.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.ql.Executor.handleTaskFailure(Executor.java:399) ~[hive-exec-3.1.3000.7.1.7.0-8-IBM.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.ql.Executor.handleFinished(Executor.java:282) ~[hive-exec-3.1.3000.7.1.7.0-8-IBM.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:247) ~[hive-exec-3.1.3000.7.1.7.0-8-IBM.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) ~[hive-exec-3.1.3000.7.1.7.0-8-IBM.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:742) ~[hive-exec-3.1.3000.7.1.7.0-8-IBM.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:497) ~[hive-exec-3.1.3000.7.1.7.0-8-IBM.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:486) ~[hive-exec-3.1.3000.7.1.7.0-8-IBM.jar:3.1.3000.7.1.7.0-551]
@szlta, @lcspinter any thoughts?
Not sure whether this is related, but the table that was created successfully has its InputFormat and OutputFormat properties set:
0: jdbc:hive2://cms.dv.net:2181/default> describe formatted t11_iceberg;
INFO : Compiling command(queryId=hive_20220922133753_360c62be-8f68-455a-a27c-e553988ab90f): describe formatted t11_iceberg
DEBUG : Encoding valid txns info 144:9223372036854775807::82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,116,118,119,120,126,129,130,131,136,141,142 txnid:144
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, type:string, comment:from deserializer), FieldSchema(name:data_type, type:string, comment:from deserializer), FieldSchema(name:comment, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hive_20220922133753_360c62be-8f68-455a-a27c-e553988ab90f); Time taken: 0.074 seconds
INFO : Executing command(queryId=hive_20220922133753_360c62be-8f68-455a-a27c-e553988ab90f): describe formatted t11_iceberg
INFO : Starting task [Stage-0:DDL] in serial mode
DEBUG : Task getting executed using mapred tag : hive_20220922133753_360c62be-8f68-455a-a27c-e553988ab90f,userid=hive
INFO : Completed executing command(queryId=hive_20220922133753_360c62be-8f68-455a-a27c-e553988ab90f); Time taken: 0.275 seconds
INFO : OK
DEBUG : Shutting down query describe formatted t11_iceberg
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| col_name | data_type | comment |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| a | int | from deserializer |
| b | int | from deserializer |
| | NULL | NULL |
| # Detailed Table Information | NULL | NULL |
| Database: | db2iuser | NULL |
| OwnerType: | USER | NULL |
| Owner: | hive | NULL |
| CreateTime: | Thu Sep 22 12:59:53 UTC 2022 | NULL |
| LastAccessTime: | UNKNOWN | NULL |
| Retention: | 0 | NULL |
| Location: | hdfs://cms.dv.net:8020/warehouse/tablespace/external/hive/db2iuser.db/T11_ICEBERG | NULL |
| Table Type: | EXTERNAL_TABLE | NULL |
| Table Parameters: | NULL | NULL |
| | EXTERNAL | TRUE |
| | TRANSLATED_TO_EXTERNAL | TRUE |
| | bucketing_version | 2 |
| | current-schema | {\"type\":\"struct\",\"schema-id\":0,\"fields\":[{\"id\":1,\"name\":\"a\",\"required\":false,\"type\":\"int\"},{\"id\":2,\"name\":\"b\",\"required\":false,\"type\":\"int\"}]} |
| | engine.hive.enabled | true |
| | external.table.purge | TRUE |
| | metadata_location | hdfs://cms.dv.net:8020/warehouse/tablespace/external/hive/db2iuser.db/T11_ICEBERG/metadata/00000-bc11bf8a-f55d-4ccd-b1e2-f28ec7f988d0.metadata.json |
| | numFiles | 0 |
| | numRows | 0 |
| | rawDataSize | 0 |
| | snapshot-count | 0 |
| | storage_handler | org.apache.iceberg.mr.hive.HiveIcebergStorageHandler |
| | table_type | ICEBERG |
| | totalSize | 0 |
| | transient_lastDdlTime | 1663851593 |
| | uuid | 441d50a9-6c8a-4fa7-bedd-cf24ceed3e0f |
| | NULL | NULL |
| # Storage Information | NULL | NULL |
| SerDe Library: | org.apache.iceberg.mr.hive.HiveIcebergSerDe | NULL |
| InputFormat: | org.apache.iceberg.mr.hive.HiveIcebergInputFormat | NULL |
| OutputFormat: | org.apache.iceberg.mr.hive.HiveIcebergOutputFormat | NULL |
| Compressed: | No | NULL |
| Num Buckets: | 0 | NULL |
| Bucket Columns: | [] | NULL |
| Sort Columns: | [] | NULL |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
38 rows selected (0.415 seconds)
while for the table that failed, those properties are null:
0: jdbc:hive2://cms.dv.net:2181/default> describe formatted t21_iceberg;
INFO : Compiling command(queryId=hive_20220922133713_f0803f51-f47a-44cb-b4ef-8ef4d920885e): describe formatted t21_iceberg
DEBUG : Encoding valid txns info 143:9223372036854775807::82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,116,118,119,120,126,129,130,131,136,141,142 txnid:143
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, type:string, comment:from deserializer), FieldSchema(name:data_type, type:string, comment:from deserializer), FieldSchema(name:comment, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hive_20220922133713_f0803f51-f47a-44cb-b4ef-8ef4d920885e); Time taken: 0.139 seconds
INFO : Executing command(queryId=hive_20220922133713_f0803f51-f47a-44cb-b4ef-8ef4d920885e): describe formatted t21_iceberg
INFO : Starting task [Stage-0:DDL] in serial mode
DEBUG : Task getting executed using mapred tag : hive_20220922133713_f0803f51-f47a-44cb-b4ef-8ef4d920885e,userid=hive
INFO : Completed executing command(queryId=hive_20220922133713_f0803f51-f47a-44cb-b4ef-8ef4d920885e); Time taken: 1.325 seconds
INFO : OK
DEBUG : Shutting down query describe formatted t21_iceberg
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| col_name | data_type | comment |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| a | int | from deserializer |
| b | int | from deserializer |
| | NULL | NULL |
| # Detailed Table Information | NULL | NULL |
| Database: | db2iuser | NULL |
| OwnerType: | USER | NULL |
| Owner: | db2iuser | NULL |
| CreateTime: | Thu Sep 22 13:20:20 UTC 2022 | NULL |
| LastAccessTime: | UNKNOWN | NULL |
| Retention: | 0 | NULL |
| Location: | hdfs://cms.dv.net:8020/warehouse/tablespace/external/hive/db2iuser.db/T21_ICEBERG | NULL |
| Table Type: | EXTERNAL_TABLE | NULL |
| Table Parameters: | NULL | NULL |
| | COLUMN_STATS_ACCURATE | {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"a\":\"true\",\"b\":\"true\"}} |
| | EXTERNAL | TRUE |
| | biginsights.sql.metadata | {\"v\":1,\"source\":\"BIGSQL\",\"version\":\"4.0\"} |
| | bucketing_version | 2 |
| | external.table.purge | true |
| | numFiles | 0 |
| | numRows | 0 |
| | rawDataSize | 0 |
| | storage_handler | org.apache.iceberg.mr.hive.HiveIcebergStorageHandler |
| | table_type | ICEBERG |
| | totalSize | 0 |
| | transient_lastDdlTime | 1663852820 |
| | NULL | NULL |
| # Storage Information | NULL | NULL |
| SerDe Library: | org.apache.iceberg.mr.hive.HiveIcebergSerDe | NULL |
| InputFormat: | null | NULL |
| OutputFormat: | null | NULL |
| Compressed: | No | NULL |
| Num Buckets: | -1 | NULL |
| Bucket Columns: | [] | NULL |
| Sort Columns: | [] | NULL |
| Storage Desc Params: | NULL | NULL |
| | serialization.format | 1 |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
36 rows selected (1.557 seconds)
Also, the Hive metastore log shows the Iceberg table reporting AllRequestedHiveResources={db2iuser/t32_iceberg; null://null; }. It is not clear why Iceberg returns null://null instead of the full HDFS directory.
2022-09-22 17:30:48,009 DEBUG org.apache.ranger.perf.resourcetrie.retrieval: [pool-5-thread-17]: [PERF] RangerPolicyRepository.getLikelyMatchEvaluators(resource=null://null): 0
2022-09-22 17:30:48,009 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyRepository: [pool-5-thread-17]: <== RangerPolicyRepository.getLikelyMatchPolicyEvaluators(null://null): evaluatorCount=0
2022-09-22 17:30:48,009 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyRepository: [pool-5-thread-17]: ==> RangerPolicyRepository.storeAuditEnabledInCache()
2022-09-22 17:30:48,010 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyRepository: [pool-5-thread-17]: <== RangerPolicyRepository.storeAuditEnabledInCache()
2022-09-22 17:30:48,010 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyEngineImpl: [pool-5-thread-17]: <== RangerPolicyEngineImpl.evaluatePoliciesNoAudit(RangerAccessRequestImpl={resource={RangerResourceImpl={ownerUser={db2iuser} elements={} }} accessType={rwstorage} user={db2iuser} userGroups={hive db2iuser } userRoles={} accessTime={Thu Sep 22 17:30:47 UTC 2022} clientIPAddress={10.89.0.3} forwardedAddresses={} remoteIPAddress={10.89.0.3} clientType={HIVEMETASTORE} action={ALTERTABLE_ADDCOLS} requestData={alter table t32_iceberg} sessionId={HiveMetaStore} resourceMatchingScope={SELF} clusterName={DV} clusterType={} context={REQUESTED_RESOURCES={AllRequestedHiveResources={db2iuser/t32_iceberg; null://null; } } token:OWNER={db2iuser} token:USER={db2iuser} } }, policyType =0, zoneName=null): RangerAccessResult={isAccessDetermined={true} isAllowed={false} isAuditedDetermined={false} isAudited={false} auditLogId={null} policyType={0} policyId={-1} zoneName={null} auditPolicyId={-1} policyVersion={null} evaluatedPoliciesCount={0} reason={null} additionalInfo={}}
2022-09-22 17:30:48,010 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyEngineImpl: [pool-5-thread-17]: <== RangerPolicyEngineImpl.zoneAwareAccessEvaluationWithNoAudit(RangerAccessRequestImpl={resource={RangerResourceImpl={ownerUser={db2iuser} elements={} }} accessType={rwstorage} user={db2iuser} userGroups={hive db2iuser } userRoles={} accessTime={Thu Sep 22 17:30:47 UTC 2022} clientIPAddress={10.89.0.3} forwardedAddresses={} remoteIPAddress={10.89.0.3} clientType={HIVEMETASTORE} action={ALTERTABLE_ADDCOLS} requestData={alter table t32_iceberg} sessionId={HiveMetaStore} resourceMatchingScope={SELF} clusterName={DV} clusterType={} context={REQUESTED_RESOURCES={AllRequestedHiveResources={db2iuser/t32_iceberg; null://null; } } token:OWNER={db2iuser} token:USER={db2iuser} } }, policyType =0): RangerAccessResult={isAccessDetermined={true} isAllowed={false} isAuditedDetermined={false} isAudited={false} auditLogId={null} policyType={0} policyId={-1} zoneName={null} auditPolicyId={-1} policyVersion={null} evaluatedPoliciesCount={0} reason={null} additionalInfo={}}
2022-09-22 17:30:48,011 DEBUG org.apache.ranger.perf.policyengine.request: [pool-5-thread-17]: [PERF] RangerPolicyEngine.evaluatePolicies(requestHashCode=53a8da6e_0): 4
2022-09-22 17:30:48,011 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyEngineImpl: [pool-5-thread-17]: <== RangerPolicyEngineImpl.evaluatePolicies(RangerAccessRequestImpl={resource={RangerResourceImpl={ownerUser={db2iuser} elements={} }} accessType={rwstorage} user={db2iuser} userGroups={hive db2iuser } userRoles={} accessTime={Thu Sep 22 17:30:47 UTC 2022} clientIPAddress={10.89.0.3} forwardedAddresses={} remoteIPAddress={10.89.0.3} clientType={HIVEMETASTORE} action={ALTERTABLE_ADDCOLS} requestData={alter table t32_iceberg} sessionId={HiveMetaStore} resourceMatchingScope={SELF} clusterName={DV} clusterType={} context={REQUESTED_RESOURCES={AllRequestedHiveResources={db2iuser/t32_iceberg; null://null; } } token:OWNER={db2iuser} token:USER={db2iuser} } }, policyType=0): RangerAccessResult={isAccessDetermined={true} isAllowed={false} isAuditedDetermined={false} isAudited={false} auditLogId={null} policyType={0} policyId={-1} zoneName={null} auditPolicyId={-1} policyVersion={null} evaluatedPoliciesCount={0} reason={null} additionalInfo={}}
2022-09-22 17:30:48,011 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyEngineImpl: [pool-5-thread-17]: ==> RangerPolicyEngineImpl.evaluateAuditPolicies(result=RangerAccessResult={isAccessDetermined={true} isAllowed={false} isAuditedDetermined={false} isAudited={false} auditLogId={null} policyType={0} policyId={-1} zoneName={null} auditPolicyId={-1} policyVersion={null} evaluatedPoliciesCount={0} reason={null} additionalInfo={}})
2022-09-22 17:30:48,011 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyEngineImpl: [pool-5-thread-17]: ==> RangerPolicyEngineImpl.evaluateTagAuditPolicies(request=RangerAccessRequestImpl={resource={RangerResourceImpl={ownerUser={db2iuser} elements={} }} accessType={rwstorage} user={db2iuser} userGroups={hive db2iuser } userRoles={} accessTime={Thu Sep 22 17:30:47 UTC 2022} clientIPAddress={10.89.0.3} forwardedAddresses={} remoteIPAddress={10.89.0.3} clientType={HIVEMETASTORE} action={ALTERTABLE_ADDCOLS} requestData={alter table t32_iceberg} sessionId={HiveMetaStore} resourceMatchingScope={SELF} clusterName={DV} clusterType={} context={REQUESTED_RESOURCES={AllRequestedHiveResources={db2iuser/t32_iceberg; null://null; } } token:OWNER={db2iuser} token:USER={db2iuser} } }, result=RangerAccessResult={isAccessDetermined={true} isAllowed={false} isAuditedDetermined={false} isAudited={false} auditLogId={null} policyType={0} policyId={-1} zoneName={null} auditPolicyId={-1} policyVersion={null} evaluatedPoliciesCount={0} reason={null} additionalInfo={}})
2022-09-22 17:30:48,012 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyEngineImpl: [pool-5-thread-17]: <== RangerPolicyEngineImpl.evaluateTagAuditPolicies(request=RangerAccessRequestImpl={resource={RangerResourceImpl={ownerUser={db2iuser} elements={} }} accessType={rwstorage} user={db2iuser} userGroups={hive db2iuser } userRoles={} accessTime={Thu Sep 22 17:30:47 UTC 2022} clientIPAddress={10.89.0.3} forwardedAddresses={} remoteIPAddress={10.89.0.3} clientType={HIVEMETASTORE} action={ALTERTABLE_ADDCOLS} requestData={alter table t32_iceberg} sessionId={HiveMetaStore} resourceMatchingScope={SELF} clusterName={DV} clusterType={} context={REQUESTED_RESOURCES={AllRequestedHiveResources={db2iuser/t32_iceberg; null://null; } } token:OWNER={db2iuser} token:USER={db2iuser} } }, result=RangerAccessResult={isAccessDetermined={true} isAllowed={false} isAuditedDetermined={false} isAudited={false} auditLogId={null} policyType={0} policyId={-1} zoneName={null} auditPolicyId={-1} policyVersion={null} evaluatedPoliciesCount={0} reason={null} additionalInfo={}})
2022-09-22 17:30:48,012 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyEngineImpl: [pool-5-thread-17]: ==> RangerPolicyEngineImpl.evaluateResourceAuditPolicies(request=RangerAccessRequestImpl={resource={RangerResourceImpl={ownerUser={db2iuser} elements={} }} accessType={rwstorage} user={db2iuser} userGroups={hive db2iuser } userRoles={} accessTime={Thu Sep 22 17:30:47 UTC 2022} clientIPAddress={10.89.0.3} forwardedAddresses={} remoteIPAddress={10.89.0.3} clientType={HIVEMETASTORE} action={ALTERTABLE_ADDCOLS} requestData={alter table t32_iceberg} sessionId={HiveMetaStore} resourceMatchingScope={SELF} clusterName={DV} clusterType={} context={REQUESTED_RESOURCES={AllRequestedHiveResources={db2iuser/t32_iceberg; null://null; } } token:OWNER={db2iuser} token:USER={db2iuser} } }, result=RangerAccessResult={isAccessDetermined={true} isAllowed={false} isAuditedDetermined={false} isAudited={false} auditLogId={null} policyType={0} policyId={-1} zoneName={null} auditPolicyId={-1} policyVersion={null} evaluatedPoliciesCount={0} reason={null} additionalInfo={}})
2022-09-22 17:30:48,012 DEBUG org.apache.ranger.perf.resourcetrie.retrieval: [pool-5-thread-17]: [PERF] RangerPolicyRepository.getLikelyMatchEvaluators(resource=null://null): 0
2022-09-22 17:30:48,012 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyRepository: [pool-5-thread-17]: <== RangerPolicyRepository.getLikelyMatchPolicyEvaluators(null://null): evaluatorCount=0
2022-09-22 17:30:48,012 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyEngineImpl: [pool-5-thread-17]: <== RangerPolicyEngineImpl.evaluateResourceAuditPolicies(request=RangerAccessRequestImpl={resource={RangerResourceImpl={ownerUser={db2iuser} elements={} }} accessType={rwstorage} user={db2iuser} userGroups={hive db2iuser } userRoles={} accessTime={Thu Sep 22 17:30:47 UTC 2022} clientIPAddress={10.89.0.3} forwardedAddresses={} remoteIPAddress={10.89.0.3} clientType={HIVEMETASTORE} action={ALTERTABLE_ADDCOLS} requestData={alter table t32_iceberg} sessionId={HiveMetaStore} resourceMatchingScope={SELF} clusterName={DV} clusterType={} context={REQUESTED_RESOURCES={AllRequestedHiveResources={db2iuser/t32_iceberg; null://null; } } token:OWNER={db2iuser} token:USER={db2iuser} } }, result=RangerAccessResult={isAccessDetermined={true} isAllowed={false} isAuditedDetermined={false} isAudited={false} auditLogId={null} policyType={0} policyId={-1} zoneName={null} auditPolicyId={-1} policyVersion={null} evaluatedPoliciesCount={0} reason={null} additionalInfo={}}): ret=false
2022-09-22 17:30:48,012 DEBUG org.apache.ranger.plugin.policyengine.RangerPolicyEngineImpl: [pool-5-thread-17]: <== RangerPolicyEngineImpl.evaluateAuditPolicies(result=RangerAccessResult={isAccessDetermined={true} isAllowed={false} isAuditedDetermined={false} isAudited={false} auditLogId={null} policyType={0} policyId={-1} zoneName={null} auditPolicyId={-1} policyVersion={null} evaluatedPoliciesCount={0} reason={null} additionalInfo={}})
2022-09-22 17:30:48,039 DEBUG org.apache.ranger.plugin.audit.RangerDefaultAuditHandler: [pool-5-thread-17]: ==> RangerDefaultAuditHandler.logAuthzAudit(AuthzAuditEvent{repositoryType=3;repositoryName=cm_hive;user=db2iuser;eventTime=Thu Sep 22 17:30:47 UTC 2022;accessType=ALTER;resourcePath=db2iuser/t32_iceberg;resourceType=@table;action=alter;accessResult=1;agentId=hiveMetastore;policyId=8;resultReason=null;aclEnforcer=ranger-acl;sessionId=HiveMetaStore;clientType=HIVEMETASTORE;clientIP=10.89.0.3;requestData=alter table t32_iceberg;agentHostname=cms.dv.net;logType=RangerAudit;eventId=ab214aac-e979-4888-8052-1a243f638e07-0;seq_num=0;event_count=1;event_dur_ms=0;tags=[];clusterName=DV;zoneName=null;policyVersion=5;additionalInfo={"remote-ip-address":10.89.0.3, "forwarded-ip-addresses":[]})
2022-09-22 17:30:48,039 DEBUG org.apache.ranger.plugin.audit.RangerDefaultAuditHandler: [pool-5-thread-17]: <== RangerDefaultAuditHandler.logAuthzAudit(AuthzAuditEvent{repositoryType=3;repositoryName=cm_hive;user=db2iuser;eventTime=Thu Sep 22 17:30:47 UTC 2022;accessType=ALTER;resourcePath=db2iuser/t32_iceberg;resourceType=@table;action=alter;accessResult=1;agentId=hiveMetastore;policyId=8;resultReason=null;aclEnforcer=ranger-acl;sessionId=HiveMetaStore;clientType=HIVEMETASTORE;clientIP=10.89.0.3;requestData=alter table t32_iceberg;agentHostname=cms.dv.net;logType=RangerAudit;eventId=ab214aac-e979-4888-8052-1a243f638e07-0;seq_num=1;event_count=1;event_dur_ms=0;tags=[];clusterName=DV;zoneName=null;policyVersion=5;additionalInfo={"remote-ip-address":10.89.0.3, "forwarded-ip-addresses":[]})
2022-09-22 17:30:48,040 DEBUG org.apache.ranger.perf.hiveauth.request: [pool-5-thread-17]: [PERF] RangerHiveAuthorizer.checkPrivileges(hiveOpType=ALTERTABLE_ADDCOLS): 55
2022-09-22 17:30:48,042 ERROR org.apache.hadoop.hive.ql.security.authorization.plugin.metastore.HiveMetaStoreAuthorizer: [pool-5-thread-17]: HiveMetaStoreAuthorizer.onEvent(): failed
org.apache.hadoop.hive.metastore.api.MetaException: Permission denied: user [db2iuser] does not have [RWSTORAGE] privilege on [null://null]
at org.apache.hadoop.hive.ql.security.authorization.plugin.metastore.HiveMetaStoreAuthorizer.checkPrivileges(HiveMetaStoreAuthorizer.java:532) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.ql.security.authorization.plugin.metastore.HiveMetaStoreAuthorizer.onEvent(HiveMetaStoreAuthorizer.java:106) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(HiveMetaStore.java:3979) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:5879) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_req(HiveMetaStore.java:5825) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_232]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_232]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_232]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232]
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at com.sun.proxy.$Proxy28.alter_table_req(Unknown Source) [?:?]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_req.getResult(ThriftHiveMetastore.java:17439) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_req.getResult(ThriftHiveMetastore.java:17423) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_232]
at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_232]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) [hadoop-common-3.1.1.7.1.7.0-551.jar:?]
at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
Finally, I think I figured out the root cause from the metastore log. It appears to be the same as https://issues.apache.org/jira/browse/HIVE-25964: Iceberg 0.14 does not implement the getURIForAuth() method.
2022-09-22 17:30:47,983 ERROR org.apache.hadoop.hive.ql.security.authorization.plugin.metastore.events.AlterTableEvent: [pool-5-thread-17]: Exception occured while getting the URI from storage handler: org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.getURIForAuth(java.util.Map)
java.lang.NoSuchMethodException: org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.getURIForAuth(java.util.Map)
at java.lang.Class.getMethod(Class.java:1786) ~[?:1.8.0_232]
at org.apache.hadoop.hive.ql.security.authorization.plugin.metastore.events.AlterTableEvent.getOutputHObjs(AlterTableEvent.java:127) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.ql.security.authorization.plugin.metastore.events.AlterTableEvent.getAuthzContext(AlterTableEvent.java:63) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.ql.security.authorization.plugin.metastore.HiveMetaStoreAuthorizer.buildAuthzContext(HiveMetaStoreAuthorizer.java:448) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.ql.security.authorization.plugin.metastore.HiveMetaStoreAuthorizer.onEvent(HiveMetaStoreAuthorizer.java:105) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(HiveMetaStore.java:3979) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:5879) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_req(HiveMetaStore.java:5825) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_232]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_232]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_232]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232]
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at com.sun.proxy.$Proxy28.alter_table_req(Unknown Source) [?:?]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_req.getResult(ThriftHiveMetastore.java:17439) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_req.getResult(ThriftHiveMetastore.java:17423) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_232]
at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_232]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) [hadoop-common-3.1.1.7.1.7.0-551.jar:?]
at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
@yikuanlee it seems that https://issues.apache.org/jira/browse/HIVE-25964 was fixed in Hive 4.0.0, but Iceberg currently works with Hive 3.1.3.
I am going to implement the getURIForAuth() method in HiveIcebergStorageHandler.java and test it with Ranger. If it works (i.e. an Iceberg table can be created with a Ranger policy in place), I will create a PR against the main branch.
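For reference, here is a minimal sketch of what such a method could look like, using the Map-based signature from the backported patch. The property names (`location`, `metadata_location`) and the fallback logic are assumptions for illustration, not the eventual implementation:

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Map;

public class IcebergAuthUriSketch {

    // Hypothetical sketch of the missing method, using the Map-based
    // signature from the backported Hive patch (HIVE-24705 / PR #1960).
    public static URI getURIForAuth(Map<String, String> tableProperties) throws URISyntaxException {
        // Assumption: authorize against the table's storage location if
        // present, falling back to the metadata file location. Without
        // either, Hive would end up with the "null://null" resource seen
        // in the Ranger logs above.
        String location = tableProperties.get("location");
        if (location == null) {
            location = tableProperties.get("metadata_location");
        }
        if (location == null) {
            throw new URISyntaxException("null", "no location available for authorization");
        }
        return new URI(location);
    }

    public static void main(String[] args) throws URISyntaxException {
        URI uri = getURIForAuth(Map.of(
            "location", "hdfs://cms.dv.net:8020/warehouse/tablespace/external/hive/db2iuser.db/T11_ICEBERG"));
        // Prints an HDFS URI that a Ranger policy could actually match.
        System.out.println(uri);
    }
}
```

With a real location returned here, Ranger would evaluate the RWSTORAGE privilege against the table's HDFS path instead of null://null.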
@nastra ,
I think I figured out why: IBM patched the Cloudera Hive 3.1.3 version with a newer ql module, which includes
ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageAuthorizationHandler.java
which came from
https://github.com/apache/hive/pull/1960/files#diff-8527b42317c96d8d571df7271740fd2d233ebd14d9b2d71f547525ca2bd3ff9b
Therefore, when a Hive alter/create event occurs, it triggers getURIForAuth() on the storage handler, which Iceberg 0.14.0 does not implement.
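The failing lookup can be reproduced with plain reflection. This self-contained sketch (HandlerWithoutAuthUri is a hypothetical stand-in for a storage handler class that lacks the method) shows why Hive ends up with no URI:

```java
import java.util.Map;

public class GetUriForAuthLookup {

    // Stand-in for a storage handler that, like Iceberg 0.14's
    // HiveIcebergStorageHandler, does not define getURIForAuth(Map).
    static class HandlerWithoutAuthUri { }

    public static void main(String[] args) {
        try {
            // Mirrors what Hive's AlterTableEvent does: look the method up
            // reflectively on the configured storage handler class.
            HandlerWithoutAuthUri.class.getMethod("getURIForAuth", Map.class);
            System.out.println("method found");
        } catch (NoSuchMethodException e) {
            // The lookup fails, Hive logs the exception and proceeds with
            // no authorization URI, which Ranger then reports as null://null.
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```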
Note that the getURIForAuth signature in that pull request is
public URI getURIForAuth(Map<String, String> tableProperties) throws URISyntaxException;
which differs from the current Hive 4 signature:
public URI getURIForAuth(org.apache.hadoop.hive.metastore.api.Table hmsTable) throws URISyntaxException
So there is a significant mismatch here. I'll close this issue, as it is not really an Iceberg 0.14 issue at all.