dbt: Insert only dbt assertions

Open Santhin opened this issue 2 years ago • 1 comments

Describe the bug: When running an ingestion that should contain only the assertions created from dbt tests, the source also ingests metadata that is already in DataHub.

To Reproduce: Steps to reproduce the behavior:

  1. Initial ingest
source:
  type: "dbt"
  config:
    manifest_path: "target/manifest.json"
    catalog_path: "target/catalog.json"
    sources_path: "target/sources.json"
    test_results_path: "target/run_results.json"
    target_platform: "postgres" # e.g. bigquery/postgres/etc.

sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
  2. Ingestion with only assertions
source:
  type: "dbt"
  config:
    # Coordinates
    manifest_path: "target/manifest.json"
    catalog_path: "target/catalog.json"
    test_results_path: "target/run_results.json"
    
    target_platform: "postgres" # e.g. bigquery/postgres/etc.
    node_type_pattern:
      allow:
        - test

sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"

Expected behavior: With node_type_pattern allowing only test, the run reports 'records_written': 18, but it should report 'records_written': 6.

Additional context: It would be useful to be able to ingest only the assertion results, in a repeatable manner, without re-sending metadata about assertion objects that already exist in DataHub. That would avoid sending redundant metadata.

Solution proposal:

  • add a flag enable_only_assertions
  • add an if statement that checks the flag at https://github.com/acryldata/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/source/dbt.py#L1298
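
A rough sketch of how the proposed flag could gate what gets emitted. This is not DataHub's actual code: the Record class, the kind labels and filter_records are hypothetical stand-ins, and treating only "assertion_result" records as assertion output is one possible reading of the request.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for one metadata record sent to the sink."""

    urn: str
    # Hypothetical labels, e.g. "dataset_metadata", "assertion_definition",
    # "assertion_result". They are not names used by DataHub itself.
    kind: str


def filter_records(
    records: Iterable[Record], enable_only_assertions: bool = False
) -> Iterator[Record]:
    """Yield the records that should be written to the sink.

    With the flag off, everything passes through unchanged. With the flag on,
    only assertion results are kept (an assumption about the intended scope).
    """
    for record in records:
        if enable_only_assertions and record.kind != "assertion_result":
            # Skip metadata that is assumed to already exist in DataHub.
            continue
        yield record
```

With the flag off, every record passes through, so existing recipes would behave as before; with it on, only the records labelled as assertion results reach the sink.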

Santhin, Jul 04 '22 08:07

This issue is stale because it has been open for 15 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

github-actions[bot], Aug 03 '22 02:08

This issue was closed because it has been inactive for 30 days since being marked as stale.

github-actions[bot], Sep 02 '22 02:09