dlt-meta

Quarantine data in silver layer

Open sanketkaleda opened this issue 1 year ago • 2 comments

We typically need to apply data quality rules between the bronze and silver layers, so we need to quarantine data in the silver layer. Currently, the quarantine feature is not supported in the silver layer. Will it be available soon?

sanketkaleda, Oct 15 '24 14:10

Usually bronze is the entry point where customers quarantine data and send it back to the source. We can introduce the quarantine feature in silver too. It would likely land in the v0.0.10 release, since the v0.0.9 release is finalized.

ravi-databricks, Oct 15 '24 17:10

Thanks for accepting it as an enhancement for v0.0.10. In our case, we need to keep the bronze layer free of any data quality rules, send the good data to silver, and quarantine the bad data in silver only. The purpose of quarantine is not to send the data back to the source but to enable detailed data quality monitoring of the rows that are rejected. The handshake with the source system will still be manual and done by the respective data product owner.

If we quarantine the data in the bronze layer itself, we lose the idea of keeping as-is source data in bronze.
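To make the requested behaviour concrete, here is a minimal plain-Python sketch of the split described above: bronze rows are left untouched, rows that pass every rule go to silver, and rows that fail any rule go to a silver quarantine set along with the names of the rules they failed. The function `split_by_rules`, the rule names, the `_failed_rules` column, and the sample rows are all hypothetical illustrations; this is not how dlt-meta or DLT implements quarantine.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Rule = Callable[[Row], bool]


def split_by_rules(rows: List[Row], rules: Dict[str, Rule]) -> Tuple[List[Row], List[Row]]:
    """Return (good_rows, quarantined_rows) without modifying the input rows."""
    good: List[Row] = []
    quarantined: List[Row] = []
    for row in rows:
        # Collect the names of every rule this row violates.
        failed = [name for name, rule in rules.items() if not rule(row)]
        if failed:
            # Keep the rejected row plus the reasons, for quality monitoring.
            quarantined.append({**row, "_failed_rules": failed})
        else:
            good.append(row)
    return good, quarantined


# Hypothetical bronze data, kept as-is from the source.
bronze_rows = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": None, "email": "b@example.com"},
    {"customer_id": 3, "email": ""},
]

# Hypothetical data quality rules applied between bronze and silver.
rules = {
    "valid_customer_id": lambda r: r["customer_id"] is not None,
    "non_empty_email": lambda r: bool(r["email"]),
}

silver_rows, silver_quarantine_rows = split_by_rules(bronze_rows, rules)
```

Because the rejected rows carry the failed rule names, the quarantine set can serve as the monitoring record, while bronze still holds all three source rows unchanged.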

sanketkaleda, Oct 16 '24 09:10

My thinking is similar to what Sanket described on 16-Oct-24: I want to store both good and bad records in bronze and apply DQE in the silver layer. Eagerly waiting for v0.0.10 to be finalised.

aayrm5, Jul 11 '25 00:07

New Silver Quarantine Table Attributes Introduced in onboarding.json:

  • silver_catalog_quarantine
  • silver_database_quarantine
  • silver_quarantine_table
  • silver_quarantine_table_properties
  • silver_quarantine_cluster
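As a rough illustration of how these attributes might appear in an onboarding entry, the sketch below builds a fragment in Python and prints it as JSON. Only the five key names come from the list above. Every value, and the shape of each value (string, object, list), is an assumption made for illustration, so check the feature branch for the actual expected format.

```python
import json

# Key names taken from the list above.
SILVER_QUARANTINE_KEYS = [
    "silver_catalog_quarantine",
    "silver_database_quarantine",
    "silver_quarantine_table",
    "silver_quarantine_table_properties",
    "silver_quarantine_cluster",
]

# Hypothetical fragment of one onboarding.json entry.
# All values and value shapes below are placeholders, not confirmed by the source.
entry = {
    "silver_catalog_quarantine": "my_catalog",
    "silver_database_quarantine": "my_silver_quarantine_db",
    "silver_quarantine_table": "customers_quarantine",
    "silver_quarantine_table_properties": {"delta.enableChangeDataFeed": "true"},
    "silver_quarantine_cluster": ["customer_id"],
}

# Simple sanity check that no quarantine attribute was left out.
missing = [key for key in SILVER_QUARANTINE_KEYS if key not in entry]

print(json.dumps(entry, indent=2))
```

A check like `missing` can be handy when editing onboarding files by hand, since a misspelled key would otherwise go unnoticed.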

Also Added:

To run tests:

python integration_tests/run_integration_tests.py --uc_catalog_name=<<catalog_name>> --source=cloudfiles

📊 Silver Onboarding DataflowSpec for customer and transaction feeds:

Image

🔁 Silver Layer DLT for customer and transaction tables:

Image Image

@sanketkaleda @aayrm5 – Please test this feature branch and share any feedback before we merge into feature/v0.0.10 for release.

ravi-databricks, Jul 24 '25 21:07