Purview-ADB-Lineage-Solution-Accelerator
Demo Install - DB compute setup requires additional steps
Describe the bug The Demo install process requires additional steps to get running
To Reproduce Steps to reproduce the behavior:
- Update the Settings.sh to include a unique Purview name (or else hit the already exists issue)
- Run the Demo installation sh
- The DB cluster fails to start with an init script error - not found
Steps to fix:
- Update the Settings.sh to include a unique Purview name (or else hit the already exists issue)
- Run the Demo installation sh
- Using the DB CLI - upload the init script and jar (as per the connector instructions - https://github.com/microsoft/Purview-ADB-Lineage-Solution-Accelerator/blob/main/deploy-base.md#install-openlineage-on-your-databricks-cluster)
- Update DB cluster Libraries to include the Maven - mssql-connector
- Update the storageServiceName & storageContainerName values in the DB "abfss-in-abfss-out-sample" notebook, as the references don't work within the current set
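The upload step above can be sketched roughly as follows. This is a minimal sketch, not the accelerator's own script: the local file names, the DBFS target folder and the `DRY_RUN` switch are assumptions for illustration, so check them against the linked deploy-base.md before use. It relies only on the Databricks CLI `databricks fs cp --overwrite` command.

```shell
#!/usr/bin/env bash
# Sketch only: file names and the DBFS folder are assumptions, not taken from the issue.
set -euo pipefail

# Hypothetical defaults; override via environment variables to match your download.
INIT_SCRIPT="${INIT_SCRIPT:-./open-lineage-init-script.sh}"
JAR_FILE="${JAR_FILE:-./openlineage-spark.jar}"
DBFS_DIR="${DBFS_DIR:-dbfs:/databricks/openlineage}"

# With DRY_RUN=1 the command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Copy the init script and the jar to DBFS, keeping their file names.
upload_openlineage_files() {
  run databricks fs cp --overwrite "$INIT_SCRIPT" "$DBFS_DIR/$(basename "$INIT_SCRIPT")"
  run databricks fs cp --overwrite "$JAR_FILE" "$DBFS_DIR/$(basename "$JAR_FILE")"
}
```

Running `DRY_RUN=1 upload_openlineage_files` first prints the two copy commands, so the paths can be checked before anything is uploaded. The cluster's init script path then has to point at the same DBFS location.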
Expected behavior
The Demo installation should work as per the instructions.
Logs
- Please include any Spark code being run that generates this error
https://gist.github.com/batemansogq/c29c2bcfb04b3e966fad5ac4648feb1d = Spark config
- Please include a gist to the OpenLineageIn and PurviewOut logs
- See how to stream Azure Function Logs
Screenshots If applicable, add screenshots to help explain your problem.
demo install - DB failure
demo install - settings update
demo install - spark config
Desktop (please complete the following information): - I have the standard MS build
- OS: [e.g. Windows, Mac]
- OpenLineage Version: [e.g. name of jar]
- Databricks Runtime Version: [e.g. 9.1, 10.1, 11.3]
- Cluster Type: [e.g. Job, Interactive]
- Cluster Mode: [e.g. Standard, High Concurrency, Single]
- Using Credential Passthrough: [e.g. Yes, No]
Additional context Add any other context about the problem here.
This work was completed in the MS non-prod tenancy; reach out to me via email for access.
Hi, @batemansogq - Thank you for using the solution accelerator! Would you help me understand what you believe fixed your cluster start issue?
It looks like you took three steps:
- Added purviewName to the settings.sh
- Update DB cluster Libraries to include the Maven - mssql-connector
- Update the DB "abfss-in-abfss-out-sample" - storageServiceName & storageContainerName values as the references don't work within the current set
However, I'm not certain how these would have affected your cluster initialization.
Thank you for any additional feedback.
Hey Will,
This covers all of the steps I needed to complete the install and get the cluster and the DB notebook running, aside from uploading the files to DBFS. But as you say, these aren't directly related to getting the DB cluster running.
- Added purviewName to the settings.sh = This was needed to get Purview installed; otherwise it kept failing with name-exists errors no matter what region I tried
- Update DB cluster Libraries to include the Maven - mssql-connector = needed to match the install script (cluster ran without this)
- update the DB "abfss-in-abfss-out-sample" = needed to run the notebook (cluster ran without this)
Apologies, I was trying to get a single issue to cover all of the instruction updates required, rather than raising 4 separate issues (=lazy)
Regards Scott
From: Will Johnson. Sent: Thursday, 8 June 2023 11:06 PM. To: microsoft/Purview-ADB-Lineage-Solution-Accelerator. Subject: Re: [microsoft/Purview-ADB-Lineage-Solution-Accelerator] Demo Install - DB compute setup requires additional steps (Issue #206)