dlt-meta

Unity Catalog Lineage is not generated

Open ln-data-bass opened this issue 1 year ago • 10 comments

When we deployed this to our environment, we noticed that no lineage is generated for our UC tables.

Is this expected behaviour?

My assumption is that the lineage would be generated by the DLT pipelines. Is there a way to sequence Bronze and Silver into one pipeline? Would this then generate the appropriate lineage in UC?

ln-data-bass avatar Sep 06 '24 04:09 ln-data-bass

You should see lineage for the silver UC table pointing to bronze. Inside DLT-META we call the DLT APIs, so it should work the same way as notebook-based SQL or Python.

Q: Is there a way to sequence the Bronze and Silver into one pipeline?
A: Currently DLT-META does not support chaining bronze/silver inside a single DLT pipeline.

ravi-databricks avatar Sep 06 '24 21:09 ravi-databricks

So far our observation is that no lineage is generated for the silver table (downstream) or the bronze table (upstream). We assumed this was because there are two pipelines. Any idea how to troubleshoot this?

ln-data-bass avatar Sep 09 '24 04:09 ln-data-bass

We see the same thing in our lineage. Our bronze streaming table and silver streaming table are each in their own schema (bronze and silver); not sure if this breaks it or not. Here is a snapshot of one of our bronze tables with no lineage into a silver table. If you look at the silver table's lineage, you don't see it going back to the bronze table.

image

WilliamMize avatar Sep 09 '24 15:09 WilliamMize

I created branch Issue_94 to chain bronze/silver into a single DLT pipeline (dlt-meta-demo). As of now you need to use Direct Publishing Mode, which is in the Preview channel (direct_publishing_mode).

Here is how dlt-meta config looks:

    "configuration": {
        "layer": "bronze_silver",
        "bronze.group": "A1",
        "silver.group": "A1",
        "bronze.dataflowspecTable": "ucname.dlt_meta_dataflowspecs_schema.bronze_dataflowspec",
        "silver.dataflowspecTable": "ucname.dlt_meta_dataflowspecs_schema.silver_dataflowspec"
    }
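
As a rough illustration only, the fragment above could also be assembled programmatically, for example when generating pipeline settings for several groups. `build_bronze_silver_configuration` is a hypothetical helper, not part of dlt-meta, and it assumes both dataflowspec tables live in the same catalog and schema, as in the snippet above.

```python
import json


def build_bronze_silver_configuration(catalog: str, schema: str, group: str) -> dict:
    """Hypothetical helper (not part of dlt-meta): returns the keys shown above."""
    # Assumption: bronze and silver dataflowspec tables share one catalog.schema.
    prefix = f"{catalog}.{schema}"
    return {
        "layer": "bronze_silver",
        "bronze.group": group,
        "silver.group": group,
        "bronze.dataflowspecTable": f"{prefix}.bronze_dataflowspec",
        "silver.dataflowspecTable": f"{prefix}.silver_dataflowspec",
    }


if __name__ == "__main__":
    config = build_bronze_silver_configuration(
        "ucname", "dlt_meta_dataflowspecs_schema", "A1"
    )
    # Print in the same shape as the pipeline settings fragment.
    print(json.dumps({"configuration": config}, indent=4))
```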

ravi-databricks avatar Sep 09 '24 23:09 ravi-databricks

Thanks @ravi-databricks. Do you think this change (getting bronze and silver into the same DLT pipeline) would fix the lineage not being generated? Or is it just using "direct publishing mode" that would fix the lineage issue?

We currently don't see any lineage for bronze or silver tables:

image

image

ln-data-bass avatar Sep 10 '24 00:09 ln-data-bass

Lineage shows for silver tables pointing to downstream pipelines, and for bronze there is nothing upstream. If you click on the lineage graph you will see the streaming table. (Screenshots: silver_linage, silver_linage_graph)

ravi-databricks avatar Sep 10 '24 02:09 ravi-databricks

If your silver tables are the target of APPLY CHANGES INTO, then currently the upstream lineage won't be shown, as documented here.

ganeshchand avatar Sep 10 '24 18:09 ganeshchand

Thanks @ganeshchand for sharing. Do you expect that the downstream lineage should be generated? Currently we don't see any lineage at all. We've followed the instructions exactly as documented and everything works as expected, except that the lineage is not generated anywhere (not for bronze or silver).

ln-data-bass avatar Sep 11 '24 01:09 ln-data-bass

That doesn't sound normal to me. Would you be able to set up a test DLT pipeline with the same set of flows and dependencies but without using dlt-meta, and see if the lineage behavior is different? This will confirm whether it is an issue with dlt-meta or not.
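
One way to compare the two pipelines beyond the Catalog Explorer UI is to query the lineage system table directly. The sketch below only builds the SQL string. It assumes system tables are enabled in the workspace and that `system.access.table_lineage` is readable by the caller. `lineage_query` and the table name `ucname.silver.customers` are hypothetical.

```python
def lineage_query(table_full_name: str, limit: int = 50) -> str:
    """Build a SQL query listing lineage rows that mention the given table."""
    # Accept only a plain catalog.schema.table name so the literal is safe to embed.
    parts = table_full_name.split(".")
    if len(parts) != 3 or not all(p.replace("_", "").isalnum() for p in parts):
        raise ValueError(f"expected catalog.schema.table, got {table_full_name!r}")
    return (
        "SELECT source_table_full_name, target_table_full_name, entity_type, event_time\n"
        "FROM system.access.table_lineage\n"
        f"WHERE source_table_full_name = '{table_full_name}'\n"
        f"   OR target_table_full_name = '{table_full_name}'\n"
        "ORDER BY event_time DESC\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # In a Databricks notebook the string would be passed to spark.sql(...);
    # here it is only printed.
    print(lineage_query("ucname.silver.customers"))
```

Running the resulting query for the same silver table after each pipeline (with and without dlt-meta) would show whether any lineage rows were recorded in either case.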

ganeshchand avatar Sep 18 '24 21:09 ganeshchand

Hi @ganeshchand, I confirmed that when I use the APPLY CHANGES INTO API in DLT, no lineage is generated (neither upstream from bronze nor downstream from silver). We see this both when using DLT META and when not using it. So this is clearly a limitation of the UC Lineage capability and not of DLT META (as per your link to the documentation).

When I don't use APPLY CHANGES INTO to update silver, both downstream and upstream lineage are generated.

@ravi-databricks, do you have any idea whether Direct Publishing Mode will resolve this limitation in UC Lineage?

ln-data-bass avatar Sep 23 '24 07:09 ln-data-bass