Unable to get lineage between Postgres and Snowflake using Fivetran connector

Open fredriv opened this issue 3 years ago • 2 comments

Affected module

Backend and ingestion framework

Describe the bug

I've ingested tables from Postgres and Snowflake into OpenMetadata. The tables are synced from Postgres to Snowflake using Fivetran.

I'm trying to use the new Fivetran connector to set up lineage between the Postgres and Snowflake tables. The connector finds the Fivetran pipelines, but it fails to locate the correct Snowflake table in OpenMetadata: the Snowflake tables use UPPER CASE identifiers, whereas Fivetran (Postgres?) seems to use lower case. As a result, the connector looks up the Snowflake table by the FQN Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost, which fails because the FQN should be Warehouse_Dev.RAW_DEV.FISK_CLOUDSQL_PUBLIC.OST.
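As a minimal illustration of the mismatch (a sketch only, not the connector's code; `resolve_fqn` and the in-memory `stored` list are hypothetical stand-ins for a catalog lookup), an exact string comparison misses the table while a case-insensitive one finds it:

```python
from typing import Iterable, Optional


def resolve_fqn(candidate: str, known_fqns: Iterable[str]) -> Optional[str]:
    """Return the stored FQN that matches `candidate` ignoring case, or None.

    Hypothetical helper: `known_fqns` stands in for whatever the catalog
    returns; it is not an OpenMetadata API.
    """
    wanted = candidate.casefold()
    for fqn in known_fqns:
        if fqn.casefold() == wanted:
            return fqn
    return None


# FQN as stored for the Snowflake table (upper-case schema and table names).
stored = ["Warehouse_Dev.RAW_DEV.FISK_CLOUDSQL_PUBLIC.OST"]
# FQN as built from the Fivetran metadata (lower case).
looked_up = "Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost"

exact_hit = looked_up in stored            # exact comparison misses
resolved = resolve_fqn(looked_up, stored)  # case-insensitive comparison hits
print(exact_hit, resolved)
```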

To Reproduce

  • Ingest tables from Postgres and Snowflake into OpenMetadata, where the Snowflake database is populated by Fivetran.
    • NB! Make sure to give the Snowflake connector the same name as the Fivetran destination (possibly another bug/missing config option?)
  • Set up the Fivetran connector in OpenMetadata to ingest the pipelines (enable debug logs)
  • See in the debug logs that the Fivetran connector is unable to locate the Snowflake tables based on FQN

Expected behavior

I expected the Fivetran connector to find both the Postgres and Snowflake tables and set up lineage between them.

Version:

  • OS: MacOS
  • Python version: 3.9.9
  • OpenMetadata version: 0.12
  • OpenMetadata Ingestion package version: openmetadata-ingestion[docker]==0.12.0.2

Additional context

Excerpt from the Fivetran connector debug log:

[2022-09-21, 13:35:05 UTC] {client.py:177} DEBUG - URL http://openmetadata-server:8585/api/v1/tables/name/fisk_cloudsql.fisk.public.ost, method GET
[2022-09-21, 13:35:05 UTC] {client.py:178} DEBUG - Data {'headers': {'Content-type': 'application/json', 'Authorization': '***'}, 'allow_redirects': False, 'params': None}
[2022-09-21, 13:35:05 UTC] {client.py:177} DEBUG - URL http://openmetadata-server:8585/api/v1/tables/name/Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost, method GET
[2022-09-21, 13:35:05 UTC] {client.py:178} DEBUG - Data {'headers': {'Content-type': 'application/json', 'Authorization': '***'}, 'allow_redirects': False, 'params': None}
[2022-09-21, 13:35:05 UTC] {ometa_api.py:546} DEBUG - Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/metadata/ingestion/ometa/client.py", line 201, in _one_request
    resp.raise_for_status()
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 1022, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://openmetadata-server:8585/api/v1/tables/name/Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/metadata/ingestion/ometa/ometa_api.py", line 539, in _get
    resp = self.client.get(f"{self.get_suffix(entity)}/{path}{fields_str}")
  File "/usr/local/lib/python3.9/site-packages/metadata/ingestion/ometa/client.py", line 232, in get
    return self._request("GET", path, data)
  File "/usr/local/lib/python3.9/site-packages/metadata/ingestion/ometa/client.py", line 179, in _request
    return self._one_request(method, url, opts, retry)
  File "/usr/local/lib/python3.9/site-packages/metadata/ingestion/ometa/client.py", line 209, in _one_request
    raise APIError(error, http_error) from http_error
metadata.ingestion.ometa.client.APIError: table instance for Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost not found

[2022-09-21, 13:35:05 UTC] {ometa_api.py:547} WARNING - GET Table for name/Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost.Error 404 - table instance for Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost not found
[2022-09-21, 13:35:05 UTC] {fivetran.py:171} INFO - Lineage Skipped for fisk_cloudsql.fisk.public.ost - Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost

fredriv avatar Sep 21 '22 14:09 fredriv

@fredriv @ulixius9 we apply a lowercase normalizer in ES to avoid this issue:

```
        "type": "keyword",
        "normalizer": "lowercase_normalizer"
      },
```
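For context, a lowercase normalizer on a keyword field lowercases both the indexed value and the term being searched, so exact-match lookups become case-insensitive. A rough Python simulation of that behavior (the `KeywordIndex` class is purely illustrative and is neither Elasticsearch nor OpenMetadata code):

```python
from typing import Dict, Optional


class KeywordIndex:
    """Toy stand-in for a keyword field with a lowercase normalizer."""

    def __init__(self) -> None:
        # Maps the normalized (lower-cased) key to the original stored value.
        self._docs: Dict[str, str] = {}

    def index(self, fqn: str) -> None:
        # Normalization is applied at index time...
        self._docs[fqn.lower()] = fqn

    def term(self, query: str) -> Optional[str]:
        # ...and again to the query term at search time.
        return self._docs.get(query.lower())


index = KeywordIndex()
index.index("Warehouse_Dev.RAW_DEV.FISK_CLOUDSQL_PUBLIC.OST")
hit = index.term("Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost")
print(hit)
```

This only helps if the lookup actually goes through the search index; a direct by-name request to the backend bypasses the normalizer.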

harshach avatar Sep 21 '22 15:09 harshach

@harshach I believe we are directly calling the backend api instead of querying to ES first here

ulixius9 avatar Sep 21 '22 15:09 ulixius9

I get a 404 when trying to look up the table with the lowercase ID:

curl -i http://localhost:8585/api/v1/tables/name/Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost
HTTP/1.1 404 Not Found
Date: Thu, 22 Sep 2022 07:50:19 GMT
Content-Type: application/json
Content-Length: 100

{"code":404,"message":"table instance for Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost not found"}

But it works when using the uppercase ID:

curl -I http://localhost:8585/api/v1/tables/name/Warehouse_Dev.RAW_DEV.FISK_CLOUDSQL_PUBLIC.OST
HTTP/1.1 200 OK
Date: Thu, 22 Sep 2022 07:53:06 GMT
Content-Type: application/json
Content-Length: 2316
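The same check can be scripted. The sketch below assumes, like the curl calls above, that the server accepts unauthenticated requests on `/api/v1/tables/name/{fqn}`; `table_url` and `table_exists` are hypothetical helper names, not part of the OpenMetadata client:

```python
import urllib.error
import urllib.parse
import urllib.request


def table_url(base_url: str, fqn: str) -> str:
    """Build the by-name lookup URL used in the curl examples."""
    quoted = urllib.parse.quote(fqn, safe="")
    return f"{base_url.rstrip('/')}/api/v1/tables/name/{quoted}"


def table_exists(base_url: str, fqn: str) -> bool:
    """Return True on HTTP 200 and False on 404; re-raise other HTTP errors."""
    request = urllib.request.Request(table_url(base_url, fqn), method="GET")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


# Example (needs a running server, so it is left commented out):
# for fqn in ("Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost",
#             "Warehouse_Dev.RAW_DEV.FISK_CLOUDSQL_PUBLIC.OST"):
#     print(fqn, table_exists("http://localhost:8585", fqn))
```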

fredriv avatar Sep 22 '22 07:09 fredriv

> @harshach I believe we are directly calling the backend api instead of querying to ES first here

@ulixius9 ES is called in fqn.build to resolve the FQN. We then call the API directly, but we should have been able to find the entity through ES first.

pmbrull avatar Oct 03 '22 04:10 pmbrull

related to https://github.com/open-metadata/OpenMetadata/issues/7690

pmbrull avatar Nov 28 '22 12:11 pmbrull

I confirmed with the user that this happens when PostgreSQL is used as the database for the OM server.

nahuelverdugo avatar Nov 30 '22 07:11 nahuelverdugo

https://github.com/open-metadata/OpenMetadata/pull/9079 has fixed this.

@fredriv, please reopen the issue if the error still occurs on 0.13.1.

nahuelverdugo avatar Dec 12 '22 08:12 nahuelverdugo