[Bug] Writing Spark SQL lineage to Atlas requires the metadata of the related Hive table to already exist
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Search before asking
- [X] I have searched in the issues and found no similar issues.
Describe the bug
Writing Spark SQL lineage to Atlas only works if the metadata of the related Hive table already exists in Atlas. If the table entity has not been written in advance, the dispatcher reports an error:
3/08/31 10:30:20 WARN AtlasLineageDispatcher: Send lineage to atlas failed.
org.apache.atlas.AtlasServiceException: Metadata service API org.apache.atlas.AtlasClientV2$API_V2@55515868 failed with status 404 (Not Found) Response Body ({"errorCode":"ATLAS-404-00-00A","errorMessage":"Referenced entity AtlasObjectId{guid='null',
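To confirm that the 404 comes from a missing `hive_table` entity rather than from the lineage plugin itself, the referenced entity can be looked up directly before running the query. The following is a minimal sketch, assuming the Atlas v2 REST endpoint `entity/uniqueAttribute/type/{typeName}` is reachable and that basic authentication is enabled; the base URL, user, and password are placeholders, and the helper names `entity_lookup_url` and `entity_exists` are hypothetical, not part of Kyuubi or Atlas.

```python
import base64
import urllib.error
import urllib.request
from urllib.parse import quote, urlencode


def entity_lookup_url(atlas_base_url, type_name, qualified_name):
    """Build the Atlas v2 lookup URL for an entity identified by qualifiedName."""
    query = urlencode({"attr:qualifiedName": qualified_name})
    base = atlas_base_url.rstrip("/")
    return f"{base}/api/atlas/v2/entity/uniqueAttribute/type/{quote(type_name)}?{query}"


def entity_exists(atlas_base_url, type_name, qualified_name, user, password):
    """Return True if Atlas knows the entity, False on 404 (assumes basic auth)."""
    request = urllib.request.Request(
        entity_lookup_url(atlas_base_url, type_name, qualified_name)
    )
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # any other status points at a different problem


# Placeholder host and port; the qualified name is the one from the error message.
url = entity_lookup_url("http://atlas-host:21000", "hive_table", "ods.test4@primary")
```

If `entity_exists(...)` returns `False` for `ods.test4@primary`, the table has not been registered in Atlas yet, which matches the "Referenced entity ... is not found" response.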
Affects Version(s)
master
Kyuubi Server Log Output
3/08/31 10:30:20 WARN AtlasLineageDispatcher: Send lineage to atlas failed.
org.apache.atlas.AtlasServiceException: Metadata service API org.apache.atlas.AtlasClientV2$API_V2@55515868 failed with status 404 (Not Found) Response Body ({"errorCode":"ATLAS-404-00-00A","errorMessage":"Referenced entity AtlasObjectId{guid='null', typeName='hive_table', uniqueAttributes={qualifiedName:ods.test4@primary}} is not found"})
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:427)
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:352)
at org.apache.atlas.AtlasBaseClient.callAPI(AtlasBaseClient.java:228)
at org.apache.atlas.AtlasClientV2.createEntities(AtlasClientV2.java:436)
at org.apache.kyuubi.plugin.lineage.dispatcher.atlas.AtlasRestClient.send(AtlasClient.scala:51)
at org.apache.kyuubi.plugin.lineage.dispatcher.atlas.AtlasLineageDispatcher.$anonfun$send$3(AtlasLineageDispatcher.scala:42)
at org.apache.kyuubi.plugin.lineage.dispatcher.atlas.AtlasLineageDispatcher.$anonfun$send$3$adapted(AtlasLineageDispatcher.scala:31)
at scala.Option.foreach(Option.scala:407)
at org.apache.kyuubi.plugin.lineage.dispatcher.atlas.AtlasLineageDispatcher.send(AtlasLineageDispatcher.scala:31)
at org.apache.kyuubi.plugin.lineage.SparkOperationLineageQueryExecutionListener.$anonfun$onSuccess$2(SparkOperationLineageQueryExecutionListener.scala:39)
at org.apache.kyuubi.plugin.lineage.SparkOperationLineageQueryExecutionListener.$anonfun$onSuccess$2$adapted(SparkOperationLineageQueryExecutionListener.scala:39)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at org.apache.kyuubi.plugin.lineage.SparkOperationLineageQueryExecutionListener.onSuccess(SparkOperationLineageQueryExecutionListener.scala:39)
at org.apache.spark.sql.util.ExecutionListenerBus.doPostEvent(QueryExecutionListener.scala:158)
at org.apache.spark.sql.util.ExecutionListenerBus.doPostEvent(QueryExecutionListener.scala:128)
at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:117)
at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:101)
at org.apache.spark.sql.util.ExecutionListenerBus.postToAll(QueryExecutionListener.scala:128)
at org.apache.spark.sql.util.ExecutionListenerBus.onOtherEvent(QueryExecutionListener.scala:140)
at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:100)
at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:117)
at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:101)
at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:105)
at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:105)
at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:100)
at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.$anonfun$run$1(AsyncEventQueue.scala:96)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1433)
at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.run(AsyncEventQueue.scala:96)
Kyuubi Engine Log Output
3/08/31 10:30:20 WARN AtlasLineageDispatcher: Send lineage to atlas failed.
org.apache.atlas.AtlasServiceException: Metadata service API org.apache.atlas.AtlasClientV2$API_V2@55515868 failed with status 404 (Not Found) Response Body ({"errorCode":"ATLAS-404-00-00A","errorMessage":"Referenced entity AtlasObjectId{guid='null', typeName='hive_table', uniqueAttributes={qualifiedName:ods.test4@primary}} is not found"})
Kyuubi Server Configurations
No response
Kyuubi Engine Configurations
No response
Additional context
no
Are you willing to submit PR?
- [ ] Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
- [X] No. I cannot submit a PR at this time.
Hello @lugela, Thanks for finding the time to report the issue! We really appreciate the community's efforts to improve Apache Kyuubi.
Thanks for reporting this issue, cc @wForget
org.apache.atlas.AtlasServiceException: Metadata service API org.apache.atlas.AtlasClientV2$API_V2@458c71eb failed with status 404 (Not Found) Response Body ({"errorCode":"ATLAS-404-00-00A","errorMessage":"Referenced entity AtlasObjectId{guid='null'
This issue is still there; it has not been resolved.