[Bug] insert overwrite directory Permission denied
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
Search before asking
- [x] I have searched in the issues and found no similar issues.
Describe the bug
Environment: Kyuubi 1.10.2 with Kerberos, Spark 3.5.6, Ranger 2.4.0. When I configure `spark.sql.extensions=org.apache.kyuubi.plugin.spark.authz.ranger.RangerSparkExtension` and then run this SQL:

```sql
insert overwrite directory "hdfs://router/user/bdwh/tmp/2025-07-23/usertable/01" select * from bdwh.dim_bussiness limit 10;
```

I get the error below:

```
Caused by: org.apache.kyuubi.plugin.spark.authz.AccessControlException: Permission denied: user [bdwh] does not have [write] privilege on [[/user/bdwh/tmp/2025-07-23/usertable/01, /user/bdwh/tmp/2025-07-23/usertable/01/]]
	at org.apache.kyuubi.plugin.spark.authz.ranger.SparkRangerAdminPlugin$.verify(SparkRangerAdminPlugin.scala:168)
	at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.$anonfun$checkPrivileges$4(RuleAuthorization.scala:81)
	at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.$anonfun$checkPrivileges$4$adapted(RuleAuthorization.scala:80)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.checkPrivileges(RuleAuthorization.scala:80)
	at org.apache.kyuubi.plugin.spark.authz.rule.Authorization.apply(Authorization.scala:34)
	at org.apache.kyuubi.plugin.spark.authz.rule.Authorization.apply(Authorization.scala:29)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:222)
	at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
	at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
	at scala.collection.immutable.List.foldLeft(List.scala:91)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:219)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:211)
```
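For context, the setup described above is typically enabled via `spark-defaults.conf`. A minimal sketch (the property value is taken from this report; the note about the Ranger config files reflects standard Ranger plugin conventions and should be checked against the Kyuubi AuthZ docs):

```properties
# Enable the Kyuubi Spark AuthZ (Ranger) extension
spark.sql.extensions=org.apache.kyuubi.plugin.spark.authz.ranger.RangerSparkExtension

# The plugin also expects Ranger client configuration
# (e.g. ranger-spark-security.xml / ranger-spark-audit.xml)
# on the Spark engine's classpath or conf directory.
```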
However, the same SQL runs fine through Hive, and the equivalent hadoop fs commands also succeed.
The Kyuubi Ranger policies for Hive databases and tables are working fine, but the HDFS policies are not working at all. I've checked the Spark Ranger plugin installation and configuration but didn't find any issues. Could you please help take a look and see what might be going wrong? Thank you very much!
Affects Version(s)
1.10.2
Kyuubi Server Log Output
Kyuubi Engine Log Output
Kyuubi Server Configurations
Kyuubi Engine Configurations
Additional context
No response
Are you willing to submit PR?
- [x] Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
- [ ] No. I cannot submit a PR at this time.
Hello @sohurdc, Thanks for finding the time to report the issue! We really appreciate the community's efforts to improve Apache Kyuubi.
Could you share screenshots of the Ranger policies you have configured?
Here are my Hive policy and HDFS policy. I have added both, but it doesn't help.
You have to add the path to your Hive policy: /user/bdwh/tmp/2025-07-23/usertable/01
- add the database, table, and column
- in the same policy, add the path (URL) as well
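For anyone hitting the same error: the suggestion above can be sketched as a Ranger Hive-service policy carrying the `url` resource for the target directory. This is a hedged sketch of a Ranger REST policy payload; the service name, policy name, and access types are placeholders (in Ranger's Hive service definition, `url` is its own resource hierarchy, so it usually lives in its own policy alongside the database/table/column policy):

```json
{
  "service": "hive_service",
  "name": "bdwh_tmp_url_access",
  "resources": {
    "url": {
      "values": ["hdfs://router/user/bdwh/tmp/2025-07-23/usertable/01"],
      "isRecursive": true
    }
  },
  "policyItems": [
    {
      "users": ["bdwh"],
      "accesses": [
        { "type": "read",  "isAllowed": true },
        { "type": "write", "isAllowed": true }
      ]
    }
  ]
}
```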
Thank you very much, you’re right — the Ranger Hive policy also needs to include the URL path, but Hive itself doesn’t require this. Is this a special permission configuration requirement of Kyuubi? Can this URL permission check be disabled? We’d like to keep Kyuubi’s permission policies consistent with Hive’s.
But now I have a new error. In Scala mode:

```
set kyuubi.operation.language=SCALA;
val in = spark.read.parquet("/user/bdwh/panther/dwd/dwd_panther_mr_eventlog/dt=20250331/hr=16");
in.show;
```

I get the error below:

```
org.apache.kyuubi.plugin.spark.authz.AccessControlException: Permission denied: user [bdwh] does not have [read] privilege on [[hdfs://router/user/bdwh/panther/dwd/dwd_panther_mr_eventlog/dt=20250331/hr=16, hdfs://router/user/bdwh/panther/dwd/dwd_panther_mr_eventlog/dt=20250331/hr=16/]]
	at org.apache.kyuubi.plugin.spark.authz.ranger.SparkRangerAdminPlugin$.verify(SparkRangerAdminPlugin.scala:168)
	at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.$anonfun$checkPrivileges$4(RuleAuthorization.scala:81)
	at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.$anonfun$checkPrivileges$4$adapted(RuleAuthorization.scala:80)
```
I have added a URL policy, but it doesn't help:
This is coming from your HDFS policy; please update it there.
"response" : "org.apache.kyuubi.plugin.spark.authz.AccessControlException: Permission denied: user [bdwh] does not have [read] privilege on [[hdfs://router/user/bdwh/panther/dwd/dwd_panther_mr_eventlog/dt=20250331/hr=16, hdfs://router/user/bdwh/panther/dwd/dwd_panther_mr_eventlog/dt=20250331/hr=16/]]\n at org.apache.kyuubi.plugin.spark.authz.ranger.SparkRangerAdminPlugin$.verify(SparkRangerAdminPlugin.scala:168)\n at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.$anonfun$checkPrivileges$4(RuleAuthorization.scala:81)\n at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.$anonfun$checkPrivileges$4$adapted(RuleAuthorization.scala:80)
hdfs policy is no use:
In Kyuubi Scala mode:

```
set kyuubi.operation.language=SCALA;
val in = spark.read.parquet("/user/bdwh/panther/dwd/dwd_panther_mr_eventlog/dt=20250331/hr=16");
in.show;
```

It's strange: `in.count` works and returns a value, but `in.show` throws an error saying there is no read permission on the directory.