Purview-ADB-Lineage-Solution-Accelerator
Lineage is not published to Purview - START events missing environment-properties
Describe the bug
START events are skipped in the OpenLineageIn function because they are missing the environment-properties field. As a result, the PurviewOut function does not process the corresponding COMPLETE events, so no lineage is published to Purview.
However, as discussed in https://github.com/OpenLineage/OpenLineage/issues/2203, this can be considered a valid scenario: the OpenLineage (OL) model is cumulative, so the RUNNING event that follows should carry the environment-properties information on top of what the START event provided.
Expected behavior
START events should be accepted even when the environment-properties field is missing. RUNNING events should be accepted as well and, when they carry environment-properties, used to fill in that information. A COMPLETE event can then be processed properly and lineage published to Purview.
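To make the expected behavior concrete, here is a minimal Python sketch of the cumulative handling described above. It is not the accelerator's actual code: the RunAccumulator class and its accept method are hypothetical names, events are assumed to be plain dicts, and environment-properties is assumed to live under run.facets, keyed by runId.

```python
from typing import Optional

ENV_FACET = "environment-properties"  # facet name taken from this issue


class RunAccumulator:
    """Hypothetical helper: remembers environment-properties per run."""

    def __init__(self) -> None:
        # runId -> latest environment-properties facet seen for that run
        self._env_by_run: dict = {}

    def accept(self, event: dict) -> Optional[dict]:
        """Accept any event; return a mergeable COMPLETE event or None."""
        run = event.get("run") or {}
        run_id = run.get("runId")
        if run_id is None:
            return None  # cannot correlate an event without a run id

        facets = run.get("facets") or {}
        env = facets.get(ENV_FACET)
        if env is not None:
            # START or RUNNING (or COMPLETE) may supply the facet
            self._env_by_run[run_id] = env

        if event.get("eventType") != "COMPLETE":
            return None  # accepted and cached, nothing to publish yet

        cached = self._env_by_run.pop(run_id, None)
        if cached is None:
            return None  # no event in this run carried the facet

        # Return a copy of the COMPLETE event with the cached facet attached
        merged_facets = dict(facets)
        merged_facets[ENV_FACET] = cached
        merged_run = dict(run)
        merged_run["facets"] = merged_facets
        merged = dict(event)
        merged["run"] = merged_run
        return merged
```

With this shape, a START event without the facet is no longer dropped; it simply contributes nothing to the cache, and a later RUNNING event can supply the missing information before COMPLETE arrives.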
Environment
- OpenLineage Version: OL 1.5.0+ (I changed the parameter for the newer version)
- Databricks Runtime Version: 14.3
- Cluster Type: Job