streaming-at-scale
Databricks → Azure Function → ADX scenario
Implement a stream processing architecture using:
- Azure Data Lake as the input data source
- Azure Databricks Structured Streaming with Auto Loader to ingest files
- An Azure Databricks job to split the data for the multi-tenant scenario
- An Azure Function to ingest the data into Azure Data Explorer (ADX)
- View and query the data in ADX
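The Databricks step above might look roughly like the following PySpark sketch. All paths, options, and the input format are illustrative assumptions, and the `cloudFiles` (Auto Loader) source only exists on the Databricks runtime, so this is a sketch of the approach rather than the actual job in this PR:

```python
# Sketch of the Databricks Structured Streaming job (assumption: runs on the
# Databricks runtime, where the "cloudFiles" Auto Loader source is available).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Incrementally pick up new files landing in the Data Lake input folder.
raw = (
    spark.readStream
    .format("cloudFiles")                                # Auto Loader source
    .option("cloudFiles.format", "json")                 # assumption: JSON input
    .option("cloudFiles.schemaLocation", "/mnt/checkpoints/schema")
    .load("/mnt/datalake/input")                         # illustrative path
)

# Write the stream back out as files, partitioned per tenant so the
# downstream split/ingestion steps can work on one tenant at a time.
query = (
    raw.writeStream
    .format("parquet")
    .option("checkpointLocation", "/mnt/checkpoints/ingest")
    .partitionBy("companyId")
    .start("/mnt/datalake/output")                       # illustrative path
)
```

Partitioning the file sink by `companyId` keeps each tenant's data in its own folder, which is what makes the later per-tenant split and ADX ingestion cheap.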
This architecture is designed for high-volume data scenarios: the data is split and pre-processed outside of ADX for performance reasons.
The data schema is based on the existing columns, with one added column, `companyId`, to support the multi-tenant scenario.
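As a concrete illustration of the per-tenant split, here is the idea in plain Python (not the actual Databricks job; the field names besides `companyId` are made up):

```python
from collections import defaultdict

def split_by_tenant(records):
    """Group records into one bucket per tenant, keyed by companyId.

    Records missing companyId land under the key None so they can be
    inspected rather than silently dropped.
    """
    buckets = defaultdict(list)
    for record in records:
        buckets[record.get("companyId")].append(record)
    return dict(buckets)

# Illustrative events: the existing columns plus the added companyId.
events = [
    {"eventId": 1, "value": 10.5, "companyId": "contoso"},
    {"eventId": 2, "value": 7.2,  "companyId": "fabrikam"},
    {"eventId": 3, "value": 3.3,  "companyId": "contoso"},
]

tenants = split_by_tenant(events)
print(sorted(tenants))          # ['contoso', 'fabrikam']
print(len(tenants["contoso"]))  # 2
```

Each bucket can then be handed to the Azure Function and ingested into the matching tenant's ADX table.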
This PR also modifies some existing modules to support a custom ARM template scenario.
There are existing, well-tested scripts in the solution for generating service principals (SPs) and storing them in Azure Key Vault (AKV); consider reusing them.
Please reopen after addressing the comments.