Purview-ADB-Lineage-Solution-Accelerator
Connector install - OpenLineage does not Initialize
Describe the bug When the Databricks compute cluster starts, OpenLineage doesn't initialize as expected, and no events are produced.
To Reproduce Steps to reproduce the behavior:
- Follow the connector installation and post-installation instructions.
- Upload the OpenLineage JAR to DBFS.
- Configure the all-purpose compute cluster (single user mode) with the Spark configuration.
- Upload the OpenLineage init script to the user workspace directory.
- Update the init script path in the cluster, with Workspace as the source type and the script's file path.
- Start the compute cluster.
Expected behavior According to the instructions at https://github.com/OpenLineage/OpenLineage/tree/main/integration/spark/databricks#initialization-logs we should see three log entries for initialization:
- "Registered listener io.openlineage.." (this one appears)
- "OpenLineageContext: Init OpenLineageContext:" (this one is missing)
- "AsyncEventQueue: Process of event SparkListenerApplicationStart" (this one appears)
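For anyone triaging a similar report, a downloaded driver log can be checked for the three entries with a short script. This is only a minimal sketch: the function name, the pattern labels, and the sample log lines are illustrative assumptions, and only the three message fragments come from the report above.

```python
import re
from typing import List

# Message fragments quoted in the report; the dictionary keys are hypothetical labels.
EXPECTED_PATTERNS = {
    "listener_registered": r"Registered listener io\.openlineage",
    "context_initialized": r"OpenLineageContext: Init OpenLineageContext",
    "application_start": r"AsyncEventQueue: Process of event SparkListenerApplicationStart",
}


def find_missing_init_entries(log_text: str) -> List[str]:
    """Return the labels of expected entries that never appear in log_text."""
    return [
        label
        for label, pattern in EXPECTED_PATTERNS.items()
        if not re.search(pattern, log_text)
    ]


# Made-up sample that mirrors the situation described: two entries present, one absent.
SAMPLE_LOG = "\n".join(
    [
        "Registered listener io.openlineage..",
        "AsyncEventQueue: Process of event SparkListenerApplicationStart",
    ]
)

if __name__ == "__main__":
    print(find_missing_init_entries(SAMPLE_LOG))
```

To use it on a real log, read the file contents into a string and pass that string to `find_missing_init_entries`.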
Logs
- Please include any Spark code being run that generates this error
spark.openlineage.version v1
spark.openlineage.namespace <adb_workspace_id>#<cluster_id>
spark.openlineage.host https://<functionapp_name>.azurewebsites.net/
spark.openlineage.url.param.code {{secrets/<scope>/<function_default_key>}}
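A quick sanity check on the pasted configuration is to parse the key/value lines and confirm that each of the four keys has a value. The sketch below assumes whitespace-separated `key value` lines as they appear in the cluster's Spark config box. The helper names are hypothetical, and the key list is simply the four keys from this report, not an authoritative list of what the connector requires.

```python
from typing import Dict, List

# The four keys shown in this report (assumed here to be the ones worth checking).
REPORTED_KEYS = [
    "spark.openlineage.version",
    "spark.openlineage.namespace",
    "spark.openlineage.host",
    "spark.openlineage.url.param.code",
]

# The configuration as pasted in the report, placeholders left untouched.
SAMPLE_CONF = """\
spark.openlineage.version v1
spark.openlineage.namespace <adb_workspace_id>#<cluster_id>
spark.openlineage.host https://<functionapp_name>.azurewebsites.net/
spark.openlineage.url.param.code {{secrets/<scope>/<function_default_key>}}
"""


def parse_spark_conf(text: str) -> Dict[str, str]:
    """Split each non-empty line into a key and the rest of the line as its value."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace only
        conf[parts[0]] = parts[1] if len(parts) > 1 else ""
    return conf


def missing_keys(conf: Dict[str, str]) -> List[str]:
    """Return reported keys that are absent or have an empty value."""
    return [key for key in REPORTED_KEYS if not conf.get(key)]


if __name__ == "__main__":
    print(missing_keys(parse_spark_conf(SAMPLE_CONF)))
```

This only catches missing or empty entries; it does not verify that the host is reachable or that the secret reference resolves.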
- Init script path configured on the cluster: Type: Workspace; File path: /Users/user_name/init_scripts/open-lineage-init-script.sh
OpenLineage init script's absolute path: /Users/user_name/init_scripts/open-lineage-init-script.sh
Screenshots If applicable, add screenshots to help explain your problem.
Desktop (please complete the following information):
- OS: Windows
- OpenLineage Version: 0.18.0
- Databricks Runtime Version: 11.3
- Cluster Type: All compute cluster
- Cluster Mode: Single User Access
- Using Credential Passthrough: N/A
Additional context Add any other context about the problem here.
@yxu1183 did you attempt to run a notebook and it failed to produce lineage?
If you're seeing the "AsyncEventQueue: Process of event SparkListenerApplicationStart" log entry, you should be able to receive lineage events!