Purview-ADB-Lineage-Solution-Accelerator

Connector install - OpenLineage does not Initialize

Open yxu1183 opened this issue 1 year ago • 1 comments

Describe the bug

When the Databricks compute cluster starts, OpenLineage doesn't initialize as expected, and no events are produced.

To Reproduce

Steps to reproduce the behavior:

  1. Follow the Connector installation and post-installation instructions.
  2. Upload the OpenLineage JAR into DBFS.
  3. Configure the all-purpose compute cluster (single user mode) with the Spark configuration.
  4. Upload the OpenLineage init script into the user workspace directory.
  5. Update the init script path in the cluster, with Workspace as the source type and the script's file path.
  6. Start the compute cluster.

Expected behavior

According to the instructions at https://github.com/OpenLineage/OpenLineage/tree/main/integration/spark/databricks#initialization-logs, we should see 3 log entries for initialization:

  • "Registered listener io.openlineage.." - this one appears
  • "OpenLineageContext: Init OpenLineageContext:" - this one is missing
  • "AsyncEventQueue: Process of event SparkListenerApplicationStart" - this one appears
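The check above can be repeated against a downloaded driver log. What follows is only a minimal sketch: the marker substrings come from the three entries quoted in this issue, while the function names and the sample log text are hypothetical and not part of OpenLineage or the connector.

```python
# Marker substrings taken from the three log entries quoted in this issue.
# The dictionary keys are made-up labels for readability.
EXPECTED_MARKERS = {
    "listener_registered": "Registered listener io.openlineage",
    "context_initialized": "OpenLineageContext: Init OpenLineageContext",
    "application_start": "AsyncEventQueue: Process of event SparkListenerApplicationStart",
}


def check_init_markers(log_text):
    """Return a dict mapping each label to True if its marker occurs in log_text."""
    return {label: marker in log_text for label, marker in EXPECTED_MARKERS.items()}


def missing_markers(log_text):
    """Return the labels of markers that do not occur in log_text."""
    return [label for label, found in check_init_markers(log_text).items() if not found]


if __name__ == "__main__":
    # Hypothetical log excerpt mirroring the situation reported here:
    # the first and third entries are present, the second is not.
    sample_log = (
        "INFO Registered listener io.openlineage..\n"
        "INFO AsyncEventQueue: Process of event SparkListenerApplicationStart\n"
    )
    print(missing_markers(sample_log))
```

A plain substring match is used on purpose, since the exact log line format (timestamps, log levels, class names) is not shown in this issue.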

Logs

  1. Please include any Spark code being run that generates this error

     spark.openlineage.version v1
     spark.openlineage.namespace <adb_workspace_id>#<cluster_id>
     spark.openlineage.host https://<functionapp_name>.azurewebsites.net/
     spark.openlineage.url.param.code {{secrets/<scope>/<function_default_key>}}

  2. Init script path updated in the cluster:

     Type: Workspace
     File path: /Users/user_name/init_scripts/open-lineage-init-script.sh

OpenLineage init script's absolute path - /Users/user_name/init_scripts/open-lineage-init-script.sh
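Before restarting the cluster, it may help to sanity-check the Spark configuration text for missing keys or placeholders that were never filled in. This is a rough sketch, assuming the cluster's Spark config is pasted as one "key value" pair per line; the helper names are hypothetical, and treating all four keys from this issue as required is an assumption rather than something the connector documentation is quoted as saying here.

```python
# The four keys listed in this issue. Treating all of them as required is an assumption.
REQUIRED_KEYS = [
    "spark.openlineage.version",
    "spark.openlineage.namespace",
    "spark.openlineage.host",
    "spark.openlineage.url.param.code",
]


def parse_spark_conf(text):
    """Parse 'key value' lines (first space separates key from value) into a dict."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition(" ")
        conf[key] = value.strip()
    return conf


def find_problems(conf):
    """Report required keys that are absent, empty, or still hold a <placeholder>."""
    problems = []
    for key in REQUIRED_KEYS:
        value = conf.get(key, "")
        if not value:
            problems.append(f"missing: {key}")
        elif "<" in value or ">" in value:
            problems.append(f"unfilled placeholder: {key}")
    return problems


if __name__ == "__main__":
    # The configuration as written in this issue, placeholders included.
    pasted = """
    spark.openlineage.version v1
    spark.openlineage.namespace <adb_workspace_id>#<cluster_id>
    spark.openlineage.host https://<functionapp_name>.azurewebsites.net/
    """
    for problem in find_problems(parse_spark_conf(pasted)):
        print(problem)
```

The check is purely textual. It does not confirm that the secret scope resolves or that the function app is reachable.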

Screenshots

If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • OS: Windows
  • OpenLineage Version: 0.18.0
  • Databricks Runtime Version: 11.3
  • Cluster Type: All-purpose compute cluster
  • Cluster Mode: Single User Access
  • Using Credential Passthrough: N/A

Additional context

Add any other context about the problem here.

yxu1183, Oct 13 '23 02:10

@yxu1183 did you attempt to run a notebook and it failed to produce lineage?

If you're seeing the "AsyncEventQueue: Process of event SparkListenerApplicationStart" log entry, you should be able to receive lineage events!

wjohnson, Dec 30 '23 05:12