streaming-at-scale
Databricks → Azure Function → ADX scenario
Implement a stream processing architecture using:
- Azure Data Lake as the input data source
- Azure Databricks Structured Streaming with Auto Loader to ingest files
- An Azure Databricks job to split the data for the multi-tenant scenario
- An Azure Function to ingest the data into Azure Data Explorer (ADX)
- View and query the data in ADX
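The Databricks step above might look roughly like the following PySpark sketch. All paths, options, and the input format are illustrative assumptions, and the `cloudFiles` (Auto Loader) source only exists on the Databricks runtime, so this is a sketch of the approach rather than the actual job in this PR:

```python
# Sketch of the Databricks Structured Streaming job (assumption: runs on the
# Databricks runtime, where the "cloudFiles" Auto Loader source is available).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Incrementally pick up new files landing in the Data Lake input folder.
raw = (
    spark.readStream
    .format("cloudFiles")                                # Auto Loader source
    .option("cloudFiles.format", "json")                 # assumption: JSON input
    .option("cloudFiles.schemaLocation", "/mnt/checkpoints/schema")
    .load("/mnt/datalake/input")                         # illustrative path
)

# Write the stream back out as files, partitioned per tenant so the
# downstream split/ingestion steps can work on one tenant at a time.
query = (
    raw.writeStream
    .format("parquet")
    .option("checkpointLocation", "/mnt/checkpoints/ingest")
    .partitionBy("companyId")
    .start("/mnt/datalake/output")                       # illustrative path
)
```

Partitioning the file sink by `companyId` keeps each tenant's data in its own folder, which is what makes the later per-tenant split and ADX ingestion cheap.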
This architecture is designed for high-volume data scenarios: the data is split and pre-processed outside of ADX for performance reasons.
The data schema is based on the existing columns, with one added column, `companyId`, to support the multi-tenant scenario.
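As a concrete illustration of the per-tenant split, here is the idea in plain Python (not the actual Databricks job; the field names besides `companyId` are made up):

```python
from collections import defaultdict

def split_by_tenant(records):
    """Group records into one bucket per tenant, keyed by companyId.

    Records missing companyId land under the key None so they can be
    inspected rather than silently dropped.
    """
    buckets = defaultdict(list)
    for record in records:
        buckets[record.get("companyId")].append(record)
    return dict(buckets)

# Illustrative events: the existing columns plus the added companyId.
events = [
    {"eventId": 1, "value": 10.5, "companyId": "contoso"},
    {"eventId": 2, "value": 7.2,  "companyId": "fabrikam"},
    {"eventId": 3, "value": 3.3,  "companyId": "contoso"},
]

tenants = split_by_tenant(events)
print(sorted(tenants))          # ['contoso', 'fabrikam']
print(len(tenants["contoso"]))  # 2
```

Each bucket can then be handed to the Azure Function and ingested into the matching tenant's ADX table.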
This PR also modifies some existing modules to support a custom ARM template scenario.
There are existing, well-tested scripts in the solution for generating service principals (SPs) and storing them in Azure Key Vault (AKV); consider reusing them.
Please reopen after addressing the comments.