
[Bug] insert overwrite directory Permission denied

Open sohurdc opened this issue 5 months ago • 8 comments

Code of Conduct

Search before asking

  • [x] I have searched in the issues and found no similar issues.

Describe the bug

Environment: kyuubi-1.10.2 (Kerberos), spark-3.5.6, ranger-2.4.0. When I configure `spark.sql.extensions=org.apache.kyuubi.plugin.spark.authz.ranger.RangerSparkExtension` and then run the SQL:

```sql
insert overwrite directory "hdfs://router/user/bdwh/tmp/2025-07-23/usertable/01" select * from bdwh.dim_bussiness limit 10;
```

I get the error below:

```
Caused by: org.apache.kyuubi.plugin.spark.authz.AccessControlException: Permission denied: user [bdwh] does not have [write] privilege on [[/user/bdwh/tmp/2025-07-23/usertable/01, /user/bdwh/tmp/2025-07-23/usertable/01/]]
	at org.apache.kyuubi.plugin.spark.authz.ranger.SparkRangerAdminPlugin$.verify(SparkRangerAdminPlugin.scala:168)
	at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.$anonfun$checkPrivileges$4(RuleAuthorization.scala:81)
	at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.$anonfun$checkPrivileges$4$adapted(RuleAuthorization.scala:80)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.checkPrivileges(RuleAuthorization.scala:80)
	at org.apache.kyuubi.plugin.spark.authz.rule.Authorization.apply(Authorization.scala:34)
	at org.apache.kyuubi.plugin.spark.authz.rule.Authorization.apply(Authorization.scala:29)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:222)
	at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
	at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
	at scala.collection.immutable.List.foldLeft(List.scala:91)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:219)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:211)
```

However, the same SQL runs fine through Hive, and the equivalent `hadoop fs` commands also work.

The Kyuubi Ranger policies for Hive databases and tables work fine, but the HDFS policies are not taking effect at all. I've checked the Spark Ranger plugin installation and configuration and didn't find any issues. Could you please take a look and see what might be going wrong? Thank you very much!
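For context, the Kyuubi Spark AuthZ plugin loads its Ranger connection settings from a `ranger-spark-security.xml` on the Spark engine's classpath. A minimal sketch of that file (property names follow the Kyuubi AuthZ plugin documentation; the admin URL, service name, and cache path here are illustrative placeholders, not values from this issue):

```xml
<configuration>
  <!-- Ranger admin endpoint the plugin pulls policies from (illustrative URL) -->
  <property>
    <name>ranger.plugin.spark.policy.rest.url</name>
    <value>http://ranger-admin:6080</value>
  </property>
  <!-- Ranger service whose policies apply to this engine (illustrative name) -->
  <property>
    <name>ranger.plugin.spark.service.name</name>
    <value>hive_service</value>
  </property>
  <!-- Local cache directory for downloaded policies (illustrative path) -->
  <property>
    <name>ranger.plugin.spark.policy.cache.dir</name>
    <value>/tmp/ranger/policycache</value>
  </property>
</configuration>
```

If the service name points at a different Ranger service than expected, table policies and path policies can behave inconsistently.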

Affects Version(s)

1.10.2

Kyuubi Server Log Output


Kyuubi Engine Log Output


Kyuubi Server Configurations


Kyuubi Engine Configurations


Additional context

No response

Are you willing to submit PR?

  • [x] Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
  • [ ] No. I cannot submit a PR at this time.

sohurdc avatar Jul 23 '25 11:07 sohurdc

Hello @sohurdc, Thanks for finding the time to report the issue! We really appreciate the community's efforts to improve Apache Kyuubi.

github-actions[bot] avatar Jul 23 '25 11:07 github-actions[bot]

Could you share screenshots of the Ranger policies you have configured?

Reactor11 avatar Aug 12 '25 12:08 Reactor11

Here are my Hive policy and HDFS policy. I have added both, but it doesn't help:

Image

Image

sohurdc avatar Aug 13 '25 09:08 sohurdc

You have to add the path to your Hive policy: /user/bdwh/tmp/2025-07-23/usertable/01

  1. Add the database, table, and column.
  2. In the same policy, you have to add the path as well.

Reactor11 avatar Aug 13 '25 10:08 Reactor11
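For reference, the path grant in the Hive service can be expressed as a url-resource policy. A rough sketch of the Ranger policy JSON (this follows the Ranger policy model; the service and policy names are illustrative, and such a policy is usually created through the Ranger admin UI rather than raw JSON):

```json
{
  "service": "hive_service",
  "name": "bdwh-export-path",
  "resources": {
    "url": {
      "values": ["hdfs://router/user/bdwh/tmp/2025-07-23/usertable/01"],
      "isRecursive": true
    }
  },
  "policyItems": [
    {
      "users": ["bdwh"],
      "accesses": [
        { "type": "read", "isAllowed": true },
        { "type": "write", "isAllowed": true }
      ]
    }
  ]
}
```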

Thank you very much, you're right: the Ranger Hive policy also needs to include the URL path, although Hive itself doesn't require this. Is this a special permission-configuration requirement of Kyuubi? Can this URL permission check be disabled? We'd like to keep Kyuubi's permission policies consistent with Hive's.

sohurdc avatar Aug 14 '25 01:08 sohurdc
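A side note on why the error lists the path twice: the plugin checks the privilege on both `.../01` and `.../01/`, and a recursive grant on a parent directory covers both forms. Ranger-style recursive path matching can be modeled in a few lines of Java (an illustrative simplification, not Ranger's or the Kyuubi plugin's actual matcher; the `PathPolicy` class and its fields are hypothetical names for this sketch):

```java
import java.util.Set;

// Hypothetical model of a single path-resource grant: a resource value,
// Ranger's "recursive" flag, and the users/access types it allows.
public class PathPolicy {
    final String path;           // resource value, e.g. /user/bdwh/tmp
    final boolean recursive;     // Ranger's "recursive" flag on the resource
    final Set<String> users;
    final Set<String> accesses;  // e.g. "read", "write"

    PathPolicy(String path, boolean recursive, Set<String> users, Set<String> accesses) {
        this.path = path;
        this.recursive = recursive;
        this.users = users;
        this.accesses = accesses;
    }

    boolean allows(String user, String reqPath, String access) {
        // Normalize trailing slashes so ".../01" and ".../01/" compare equal.
        String normalized = reqPath.endsWith("/")
                ? reqPath.substring(0, reqPath.length() - 1) : reqPath;
        String base = path.endsWith("/")
                ? path.substring(0, path.length() - 1) : path;
        // A recursive grant covers the base path and every sub-path;
        // a non-recursive grant covers only the exact path.
        boolean pathOk = recursive
                ? normalized.equals(base) || normalized.startsWith(base + "/")
                : normalized.equals(base);
        return pathOk && users.contains(user) && accesses.contains(access);
    }
}
```

Under this model, one recursive grant on `/user/bdwh/tmp` would satisfy the write check on both `.../usertable/01` and `.../usertable/01/`.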

But I have a new error. After `set kyuubi.operation.language=SCALA;` I run:

```scala
val in = spark.read.parquet("/user/bdwh/panther/dwd/dwd_panther_mr_eventlog/dt=20250331/hr=16")
in.show
```

and get the error below:

```
"response" : "org.apache.kyuubi.plugin.spark.authz.AccessControlException: Permission denied: user [bdwh] does not have [read] privilege on [[hdfs://router/user/bdwh/panther/dwd/dwd_panther_mr_eventlog/dt=20250331/hr=16, hdfs://router/user/bdwh/panther/dwd/dwd_panther_mr_eventlog/dt=20250331/hr=16/]]
	at org.apache.kyuubi.plugin.spark.authz.ranger.SparkRangerAdminPlugin$.verify(SparkRangerAdminPlugin.scala:168)
	at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.$anonfun$checkPrivileges$4(RuleAuthorization.scala:81)
	at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.$anonfun$checkPrivileges$4$adapted(RuleAuthorization.scala:80)
```

I have already added a URL policy, but it doesn't help:

Image

sohurdc avatar Aug 14 '25 04:08 sohurdc

This is coming from your HDFS policy; please update it there:

```
"response" : "org.apache.kyuubi.plugin.spark.authz.AccessControlException: Permission denied: user [bdwh] does not have [read] privilege on [[hdfs://router/user/bdwh/panther/dwd/dwd_panther_mr_eventlog/dt=20250331/hr=16, hdfs://router/user/bdwh/panther/dwd/dwd_panther_mr_eventlog/dt=20250331/hr=16/]]
	at org.apache.kyuubi.plugin.spark.authz.ranger.SparkRangerAdminPlugin$.verify(SparkRangerAdminPlugin.scala:168)
	at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.$anonfun$checkPrivileges$4(RuleAuthorization.scala:81)
	at org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization.$anonfun$checkPrivileges$4$adapted(RuleAuthorization.scala:80)
```

Reactor11 avatar Aug 14 '25 06:08 Reactor11

The HDFS policy doesn't help either:

Image

In Kyuubi Scala mode (`set kyuubi.operation.language=SCALA;`), with `val in = spark.read.parquet("/user/bdwh/panther/dwd/dwd_panther_mr_eventlog/dt=20250331/hr=16")`, it's strange: `in.count` works and returns a value, but `in.show` throws an error saying there is no read permission on the directory.

sohurdc avatar Aug 14 '25 06:08 sohurdc