Data Factory and Purview account are not connected

Open zeinab-mk opened this issue 3 years ago • 7 comments

After the deployment completed, I did not see the catalogUri tag on the ADF resource, and the ADF connection showed a Disconnected status in Azure Purview.
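For anyone wanting to reproduce the check, here is a minimal sketch that tests whether a resource's JSON carries a `catalogUri` tag. It assumes the ADF resource has been exported as JSON with a top-level `tags` object (as ARM resources are); the helper name `has_catalog_uri` and the sample values are hypothetical.

```python
import json


def has_catalog_uri(resource: dict) -> bool:
    """Return True if the resource JSON carries a non-empty catalogUri tag."""
    # "tags" may be missing or null on resources without any tags.
    tags = resource.get("tags") or {}
    return bool(tags.get("catalogUri"))


if __name__ == "__main__":
    # Hypothetical sample; in practice, load the exported resource JSON instead.
    sample = json.loads('{"name": "my-adf", "tags": {"Owner": "data-team"}}')
    print("catalogUri tag present:", has_catalog_uri(sample))
```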

zeinab-mk avatar May 07 '21 05:05 zeinab-mk

That is a good point. We will add the role assignment to the ARM templates so that Purview automatically has access. We will probably give the MSI access to the whole subscription so that it can also scan all kinds of data sources.

Ideally, we would add the Purview MSI as Reader on the Management Group so that it can scan all kinds of data assets within the tenant. However, this is not something we can perform automatically without sufficient access rights.

Therefore, I would suggest adding it to each Landing Zone as part of the Landing Zone deployment. @mboswell any thoughts, or do you agree?
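As a rough illustration of the subscription-level assignment described above, the following sketch builds the JSON for a `Microsoft.Authorization/roleAssignments` resource that grants the built-in Reader role to a principal. This is not the template from the repository: the function name, the `2022-04-01` API version, and the deterministic naming scheme are assumptions made for the example, and the IDs in the usage section are placeholders.

```python
import json
import uuid

# GUID of the built-in Azure "Reader" role definition.
READER_ROLE_ID = "acdd72a7-3385-48ef-bd42-f606fba81ae7"


def reader_role_assignment(subscription_id: str, principal_id: str) -> dict:
    """Build an ARM role assignment resource granting Reader on a subscription.

    principal_id is the object ID of the managed identity (here: the Purview MSI).
    """
    scope = f"/subscriptions/{subscription_id}"
    # Role assignment names must be GUIDs. Deriving the GUID from the inputs
    # (an assumption of this sketch) keeps repeated deployments idempotent.
    name = str(uuid.uuid5(uuid.NAMESPACE_URL, f"{scope}|{principal_id}|{READER_ROLE_ID}"))
    return {
        "type": "Microsoft.Authorization/roleAssignments",
        "apiVersion": "2022-04-01",  # assumed API version
        "name": name,
        "properties": {
            "roleDefinitionId": (
                f"{scope}/providers/Microsoft.Authorization/roleDefinitions/{READER_ROLE_ID}"
            ),
            "principalId": principal_id,
            "principalType": "ServicePrincipal",
        },
    }


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    resource = reader_role_assignment(
        "00000000-0000-0000-0000-000000000000",
        "11111111-1111-1111-1111-111111111111",
    )
    print(json.dumps(resource, indent=2))
```

The resulting object could be placed in the `resources` array of a subscription-scoped deployment. Whoever runs that deployment still needs permission to create role assignments at that scope.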

marvinbuss avatar May 07 '21 08:05 marvinbuss

Same issue as https://github.com/Azure/data-landing-zone/issues/115.

marvinbuss avatar May 07 '21 15:05 marvinbuss

This actually requires adding the MSI of Data Factory as "Purview Data Curator". This is not required for Synapse. Follow-up required from my side.

marvinbuss avatar May 12 '21 17:05 marvinbuss

We will not add this for now. A SHIR and a Service Principal are required for scans anyway if all services (e.g. Purview, Synapse, Data Factory) are behind private endpoints, so this role assignment is not actually needed when private endpoints are used end-to-end.

marvinbuss avatar Sep 06 '21 07:09 marvinbuss

https://github.com/Azure/data-landing-zone/pull/190 will add private link connectivity for ADF. Synapse does not expose private endpoints via ARM and hence we cannot automate the setup in Synapse.

marvinbuss avatar Oct 15 '21 16:10 marvinbuss

All role assignments for Purview have now been moved into the data plane. Hence, without using self-hosted agents, we are not able to access a private Purview instance. That means we cannot make any role assignments from ARM to a collection, other than the collection Admin role assignment on the root collection. In summary, all ADF and Synapse role assignments currently have to be made manually by a user in the Purview Portal.

marvinbuss avatar Oct 26 '21 08:10 marvinbuss

Update: I am working on full automation of lineage and data source onboarding here: https://github.com/marvinbuss/PurviewAutomation

marvinbuss avatar Dec 23 '21 07:12 marvinbuss