dlt-meta
This is a metadata-driven, DLT-based framework for building bronze/silver pipelines.
Point to a source system's database/catalog, and DLT-META should launch DLT pipelines to create the ingestion flow:
- Create the DLT-META config automatically
- Launch the DLT pipelines
Corrected the Delta source table key from `"table"` to `"source_table"`.
Need to provide the ability to add file metadata to the DataFrame, e.g.:
```python
import dlt

@dlt.table
def bronze():
    return (spark.readStream.format("cloudFiles")
        # define the schema for the ~6 common...
```
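One way to satisfy the request above is to select Auto Loader's built-in `_metadata` column. A minimal sketch, assuming a Databricks DLT runtime; the source path, file format, and table name are placeholders, not part of the original request:

```python
# Hypothetical sketch: surfacing Auto Loader file metadata in a bronze table.
# Requires the Databricks DLT runtime; path and options are placeholders.
import dlt

@dlt.table
def bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/path/to/landing")  # placeholder landing zone
        # _metadata carries file_path, file_name, file_size and
        # file_modification_time for each input file
        .selectExpr("*", "_metadata AS source_metadata")
    )
```

Aliasing `_metadata` into a named column persists the file lineage into the bronze table rather than keeping it as a hidden column.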
Changed the CLI onboard command, which was using dbfs.create and causing an "already exists" error for Azure.
When using the delta option for the bronze source_format, the read_dlt_delta method fails with a key error:
```
.{bronze_dataflow_spec.sourceDetails["table"]}
KeyError: 'table'
```
Switching the key to `source_table` worked for...
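The fix reported above can be illustrated with a plain dict standing in for `bronze_dataflow_spec.sourceDetails` (the dict contents here are hypothetical; only the key names come from the issue):

```python
# Minimal illustration of the key fix. sourceDetails stores the table name
# under "source_table", so looking up "table" raises KeyError.
source_details = {
    "source_database": "raw",        # hypothetical values for illustration
    "source_table": "customers",
}

# Reading "source_table" matches what onboarding actually writes:
qualified_name = f'{source_details["source_database"]}.{source_details["source_table"]}'
```

With the corrected key, `qualified_name` resolves cleanly instead of raising `KeyError: 'table'`.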
Following the [default docs](https://databrickslabs.github.io/dlt-meta/getting_started/dltpipelineopt1/) with default argument values, it fails with the same error regardless of the DBFS folder: `databricks.sdk.errors.platform.ResourceAlreadyExists: A file or directory already exists at the input path dbfs:/dlt-meta_cli_demo/dltmeta_conf/onboarding.json`.
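The error comes from creating a file that already exists on re-runs. The fix pattern, shown here with a local-filesystem analogue (the helper name and payload are hypothetical, not the project's API):

```python
# Local-filesystem analogue of the onboarding fix: overwrite (or skip)
# instead of failing when the config file already exists.
import os
import tempfile

def write_onboarding(path: str, payload: str, overwrite: bool = True) -> None:
    if os.path.exists(path) and not overwrite:
        raise FileExistsError(path)
    with open(path, "w") as f:  # "w" truncates, so re-runs succeed
        f.write(payload)

tmp = os.path.join(tempfile.mkdtemp(), "onboarding.json")
write_onboarding(tmp, "{}")
write_onboarding(tmp, '{"v": 2}')  # second run no longer fails
```

The same idea applies to the DBFS call: pass an overwrite flag (or delete-then-create) so repeated CLI runs are idempotent.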
Integrate the append_flow API for the following use cases:
1. One-time backfill
2. Multiple Kafka topics writing to the same target

[API docs ref](https://docs.databricks.com/en/delta-live-tables/python-ref.html#write-to-a-streaming-table-from-multiple-source-streams)
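The second use case above can be sketched with `dlt.append_flow` fanning several Kafka topics into one streaming table. A sketch only, assuming the Databricks DLT runtime; topic names and bootstrap servers are placeholders:

```python
# Hypothetical sketch: multiple Kafka topics appending to one target table.
# Requires the Databricks DLT runtime; connection details are placeholders.
import dlt

dlt.create_streaming_table("kafka_target")

for topic in ["orders_eu", "orders_us"]:  # placeholder topic names
    @dlt.append_flow(target="kafka_target", name=f"ingest_{topic}")
    def flow(topic=topic):  # default-arg trick binds the loop variable
        return (
            spark.readStream.format("kafka")
            .option("kafka.bootstrap.servers", "host:9092")  # placeholder
            .option("subscribe", topic)
            .load()
        )
```

The one-time backfill case is the same decorator with `once=True` on the flow, per the linked API docs.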
Support non-Delta sinks using the metadata approach.
- If the sink in the metadata is non-Delta, use the Structured Streaming approach with foreachBatch
- Use DAB to deploy non-DLT pipelines to...
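The foreachBatch approach mentioned above could look like the following sketch, assuming a Spark runtime; the JDBC URL, table names, and checkpoint path are placeholder assumptions:

```python
# Hypothetical sketch: writing a stream to a non-Delta (JDBC) sink with
# foreachBatch. Requires a Spark runtime; all names are placeholders.
def write_to_jdbc(batch_df, batch_id):
    (batch_df.write
        .format("jdbc")
        .option("url", "jdbc:postgresql://host/db")  # placeholder sink
        .option("dbtable", "silver_out")
        .mode("append")
        .save())

(spark.readStream.table("bronze_source")  # placeholder source table
    .writeStream
    .foreachBatch(write_to_jdbc)
    .option("checkpointLocation", "/chk/silver_out")
    .start())
```

foreachBatch gives each micro-batch as a static DataFrame, so any batch writer (JDBC, REST, files) can serve as the sink even though it has no native streaming support.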
Integrate Databricks Labs [blueprint](https://github.com/databrickslabs/blueprint) into the code base for the CLI and unit tests.
Could we add a check to the onboarding process that throws a warning when a table in the onboarding file is used but the corresponding silver transformation table couldn't...