
Ingestion: ClickHouse + dbt column lineage is processed incorrectly

Open CapitanHeMo opened this issue 1 year ago • 0 comments

Affected module Ingestion Framework

Describe the bug While parsing manifest data, the step that extracts column lineage from "compiled_code"/"compiled_sql" fails with the error: Lineage computed with SqlFluff did not perform as expected for the [clickhouse] query: [create table "schema"."schema"."table" as ...

To Reproduce

Expected behavior

Version:

  • OS: [Docker]
  • Python version: 3.11
  • OpenMetadata version: 1.3.2
  • OpenMetadata Ingestion package version: 1.3.2

Additional context

I did a little research and found that the error comes from building the FQN with the layout of a regular database, {database}.{schema}.{table}. ClickHouse does not use a schema object, so the generated statement repeats the same name twice and parsing it causes an error.

Linked file: OpenMetadata/ingestion/src/metadata/ingestion/source/database/dbt/metadata.py

...
            source_elements = fqn.split(to_entity.fullyQualifiedName.__root__)
            # remove service name from fqn to make it parseable in format db.schema.table
            query_fqn = fqn._build(  # pylint: disable=protected-access
                *source_elements[-3:]
            )
            query = (
                f"create table {query_fqn} as {data_model_link.datamodel.sql.__root__}"
            )
...

This produces queries like

create table "schema"."schema"."table" as select 1
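
To make the duplication easier to see, here is a minimal standalone sketch in plain Python. It does not use the OpenMetadata fqn module: split_fqn, build_query_name and both query_name_* helpers are hypothetical stand-ins, and the example FQN is invented. It assumes the FQN has the form service.database.schema.table with the database and schema elements being identical, as in the query above.

```python
def split_fqn(fqn: str) -> list[str]:
    """Naive split on dots; assumes no dots inside the names themselves."""
    return fqn.split(".")


def build_query_name(parts: list[str]) -> str:
    """Quote each part and join with dots, e.g. "db"."schema"."table"."""
    return ".".join(f'"{part}"' for part in parts)


def query_name_current(fqn: str) -> str:
    # Mirrors the slicing in the linked snippet: keep the last three elements.
    return build_query_name(split_fqn(fqn)[-3:])


def query_name_dedup(fqn: str) -> str:
    # Hypothetical workaround (an assumption, not the project's fix):
    # drop the database element when it only repeats the schema name.
    database, schema, table = split_fqn(fqn)[-3:]
    if database == schema:
        return build_query_name([schema, table])
    return build_query_name([database, schema, table])


# Invented example FQN: service.database.schema.table
fqn = "clickhouse_service.analytics.analytics.orders"
print(f"create table {query_name_current(fqn)} as select 1")
print(f"create table {query_name_dedup(fqn)} as select 1")
```

The first line printed has the same three-part shape as the failing query, the second uses a two-part name. Whether the two-part form is enough for the lineage parser has not been verified here; the sketch only shows where the repeated element comes from.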

The dbt-clickhouse driver documentation (linked here) describes that the profile value {env}.database must be empty or equal to schema. dbt-clickhouse

CapitanHeMo avatar Apr 19 '24 07:04 CapitanHeMo