OpenMetadata
Ingestion: ClickHouse + dbt column lineage is processed incorrectly
Affected module Ingestion Framework
Describe the bug
While parsing the dbt manifest, the step that extracts column lineage from "compiled_code"/"compiled_sql" fails with the following error:
Lineage computed with SqlFluff did not perform as expected for the [clickhouse] query: [create table "schema"."schema"."table" as ...
To Reproduce
Expected behavior
Version:
- OS: [Docker]
- Python version: 3.11
- OpenMetadata version: 1.3.2
- OpenMetadata Ingestion package version: 1.3.2
Additional context
After some investigation, I found that the error comes from building the fully qualified name (FQN) the way a regular database expects it: {database}.{schema}.{table}. ClickHouse has no separate schema object, so the generated statement cannot be parsed.
Linked file: OpenMetadata/ingestion/src/metadata/ingestion/source/database/dbt/metadata.py
...
source_elements = fqn.split(to_entity.fullyQualifiedName.__root__)
# remove service name from fqn to make it parseable in format db.schema.table
query_fqn = fqn._build(  # pylint: disable=protected-access
    *source_elements[-3:]
)
query = (
    f"create table {query_fqn} as {data_model_link.datamodel.sql.__root__}"
)
...
This produces queries like
create table "schema"."schema"."table" as select 1
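To make the duplication easier to see, here is a minimal standalone sketch of the slicing step. It does not use the real `fqn` module: `build_query_fqn`, `quote`, the `collapse_duplicate_database` flag, and the example FQN are all hypothetical, and the sketch assumes a plain dot-separated FQN of the form service.database.schema.table with no dots inside the names. Collapsing the repeated part is only one possible workaround, and I have not verified that SqlFluff's clickhouse dialect accepts the resulting two-part name in this statement.

```python
def quote(part: str) -> str:
    """Wrap an identifier in double quotes, as in the failing query."""
    return f'"{part}"'


def build_query_fqn(entity_fqn: str, collapse_duplicate_database: bool = False) -> str:
    """Hypothetical stand-in for the FQN slicing in dbt/metadata.py.

    Assumes a simple dot-separated FQN (service.database.schema.table)
    with no dots inside the individual names.
    """
    # Drop the service name by keeping only the last three parts.
    parts = entity_fqn.split(".")[-3:]
    # Possible workaround (assumption): when database and schema are the
    # same name, keep it once so the result is "schema"."table".
    if collapse_duplicate_database and len(parts) == 3 and parts[0] == parts[1]:
        parts = parts[1:]
    return ".".join(quote(p) for p in parts)


entity_fqn = "clickhouse_service.schema.schema.table"  # made-up example FQN

# Current behaviour: the database slot repeats the schema name.
print(f"create table {build_query_fqn(entity_fqn)} as select 1")
# With the hypothetical workaround enabled.
print(f"create table {build_query_fqn(entity_fqn, True)} as select 1")
```

The first statement matches the query shown above; the second uses the two-part name that ClickHouse itself would expect.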
The dbt-clickhouse driver documentation describes this profile setting: the value of {env}.database must be empty or equal to schema. Link: dbt-clickhouse