Purview-ADB-Lineage-Solution-Accelerator

Lineage not getting displayed for all tables

Open mithun1979 opened this issue 1 year ago • 1 comments

Lineage became visible for a table on the first run. However, it is no longer updating after I included tables from additional notebooks. The code does a simple CTAS: CREATE TABLE <TABLE_NAME> USING DELTA AS SELECT * FROM <SOURCE_TABLE_NAME>

The source table is in ADLS Gen2. The target table is a managed table in DBFS (Databricks default database).

Expected behavior: New lineage information should show up in Purview.

Logs: PurviewOut.log, OpenLineageIn.log

In PurviewOut.log, there is an error:

Information
2023-05-24 10:00:39.049 Error Loading to Purview JSON Entiitesto Purview: Return Code: BadRequest - Reason:Bad Request Error
2023-05-24 10:00:39.049 Purview Publish Entity Metadata Error : Error :{"requestId":"fc68faa4-73c4-4808-a77b-2fe96f65546e","errorCode":"ATLAS-400-00-036","errorMessage":"invalid relationshipDef: process_dataset_outputs: end type 1: databricks_process, end type 2: databricks_notebook"} Error
2023-05-24 10:00:40.128 Executed 'Functions.PurviewOut' (Succeeded, Id=0783df86-0011-480e-90c2-1c3660514b4d, Duration=4766ms) Information
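For anyone triaging similar log lines, the JSON payload in the error above can be pulled apart to see which relationship and which two entity types Purview rejected. This is only a rough sketch: `parse_atlas_error` is a hypothetical helper written for this thread, not part of the accelerator, and the regular expression assumes the message always follows the exact wording seen in this one log entry.

```python
import json
import re

# Error line copied from the PurviewOut.log excerpt in this issue.
LOG_LINE = (
    'Purview Publish Entity Metadata Error : Error :'
    '{"requestId":"fc68faa4-73c4-4808-a77b-2fe96f65546e",'
    '"errorCode":"ATLAS-400-00-036",'
    '"errorMessage":"invalid relationshipDef: process_dataset_outputs: '
    'end type 1: databricks_process, end type 2: databricks_notebook"}'
)


def parse_atlas_error(line):
    """Hypothetical helper: extract the error code, relationship name,
    and the two end types from a log line that embeds a JSON payload."""
    # The JSON object runs from the first '{' to the last '}' on the line.
    payload = json.loads(line[line.index("{"):line.rindex("}") + 1])
    # Assumed message shape, based only on the single entry in this issue.
    match = re.search(
        r"invalid relationshipDef: (\w+): end type 1: (\w+), end type 2: (\w+)",
        payload["errorMessage"],
    )
    return {
        "errorCode": payload["errorCode"],
        "relationship": match.group(1) if match else None,
        "end1": match.group(2) if match else None,
        "end2": match.group(3) if match else None,
    }


if __name__ == "__main__":
    print(parse_atlas_error(LOG_LINE))
```

Here the second end type comes back as databricks_notebook, which shows that the output side of the process was linked to a notebook entity rather than a table.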

Screenshots: NA

Desktop (please complete the following information):

  • OS: Windows
  • OpenLineage Version: openlineage-spark-0.18.0.jar
  • Databricks Runtime Version: 11.3
  • Cluster Type: Interactive
  • Cluster Mode: No Isolation Shared
  • Using Credential Passthrough: No

Additional context: The lineage data showed up the first time, so the setup seems to be good. There appears to be an Atlas error in PurviewOut.log.

mithun1979, May 24 '23 11:05

@mithun1979 it looks like the input is okay, but the output is pointing to /user/hive/warehouse/test_call_center_schema_chnaged and is not mapping correctly to the Hive metastore. Unfortunately, the search results include a databricks_notebook, and the solution tries to map the Hive table to the first object that Purview search returns as a match.
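To illustrate the failure mode described above, here is a toy sketch of how taking the first search hit can land on a notebook, compared with filtering hits by entity type first. It is not the accelerator's code: the hit dictionaries, the `entityType` and `name` keys, and both function names are made up for illustration, and real Purview search responses may be shaped differently.

```python
# Hypothetical search hits; the keys and values are illustrative only
# and are not taken from a real Purview search response.
SEARCH_HITS = [
    {"entityType": "databricks_notebook", "name": "test_call_center_schema_chnaged"},
    {"entityType": "hive_table", "name": "test_call_center_schema_chnaged"},
]


def first_match(hits):
    """Take whatever the search returns first, regardless of type."""
    return hits[0] if hits else None


def first_match_of_type(hits, allowed_types):
    """Take the first hit whose entity type is one of the allowed types."""
    for hit in hits:
        if hit["entityType"] in allowed_types:
            return hit
    return None


if __name__ == "__main__":
    print(first_match(SEARCH_HITS)["entityType"])
    print(first_match_of_type(SEARCH_HITS, {"hive_table"})["entityType"])
```

With these made-up hits, the unfiltered lookup returns the notebook, which matches the end type 2 reported in the Atlas error, while the type-filtered lookup returns the table.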

Support for Delta is currently limited, but we will try to improve it.

wjohnson, Dec 30 '23 07:12