Unable to get lineage between Postgres and Snowflake using Fivetran connector
Affected module
Backend and ingestion framework
Describe the bug
I've ingested tables from Postgres and Snowflake into OpenMetadata. The tables are synced from Postgres to Snowflake using Fivetran.
I'm trying to use the new Fivetran connector to set up lineage between the Postgres and Snowflake tables. The connector finds the Fivetran pipelines, but it fails to locate the correct Snowflake table in OpenMetadata, because the Snowflake tables use UPPER CASE identifiers whereas Fivetran (Postgres?) seems to use lower case. The connector therefore looks up the Snowflake table by the FQN Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost, which fails because the FQN should be Warehouse_Dev.RAW_DEV.FISK_CLOUDSQL_PUBLIC.OST.
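The mismatch can be illustrated with a minimal Python sketch. It assumes the lookup is an exact, case-sensitive string match on the FQN, which is what the 404 in the debug log suggests; `build_fqn` and `known_tables` are hypothetical stand-ins and not the connector's actual code.

```python
# Hypothetical illustration; not the actual Fivetran connector code.

def build_fqn(service: str, database: str, schema: str, table: str) -> str:
    """Join the four name parts into a dotted fully qualified name."""
    return ".".join([service, database, schema, table])


# FQN as stored in OpenMetadata after the Snowflake ingestion (upper-case identifiers).
known_tables = {"Warehouse_Dev.RAW_DEV.FISK_CLOUDSQL_PUBLIC.OST"}

# FQN as assembled from the lower-case names seen in the Fivetran lookup.
requested = build_fqn("Warehouse_Dev", "raw_dev", "fisk_cloudsql_public", "ost")

# Exact match fails, while a case-insensitive comparison succeeds.
exact_match = requested in known_tables
case_insensitive_match = requested.lower() in {fqn.lower() for fqn in known_tables}

print(exact_match)             # False
print(case_insensitive_match)  # True
```

Only the database, schema, and table parts differ in case; the service name is identical in both FQNs.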
To Reproduce
- Ingest tables from Postgres and Snowflake into OpenMetadata, where the Snowflake database is populated by Fivetran.
- NB! Make sure to name the Snowflake connector the same as the Fivetran destination (possibly another bug/missing config option?)
- Set up the Fivetran connector in OpenMetadata to ingest the pipelines (enable debug logs)
- See in the debug logs that the Fivetran connector is unable to locate the Snowflake tables based on FQN
Expected behavior
I expected the Fivetran connector to find both the Postgres and Snowflake tables and set up lineage between them.
Version:
- OS: MacOS
- Python version: 3.9.9
- OpenMetadata version: 0.12
- OpenMetadata Ingestion package version:
openmetadata-ingestion[docker]==0.12.0.2
Additional context
Excerpt from the Fivetran connector debug log:
[2022-09-21, 13:35:05 UTC] {client.py:177} DEBUG - URL http://openmetadata-server:8585/api/v1/tables/name/fisk_cloudsql.fisk.public.ost, method GET
[2022-09-21, 13:35:05 UTC] {client.py:178} DEBUG - Data {'headers': {'Content-type': 'application/json', 'Authorization': '***'}, 'allow_redirects': False, 'params': None}
[2022-09-21, 13:35:05 UTC] {client.py:177} DEBUG - URL http://openmetadata-server:8585/api/v1/tables/name/Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost, method GET
[2022-09-21, 13:35:05 UTC] {client.py:178} DEBUG - Data {'headers': {'Content-type': 'application/json', 'Authorization': '***'}, 'allow_redirects': False, 'params': None}
[2022-09-21, 13:35:05 UTC] {ometa_api.py:546} DEBUG - Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/metadata/ingestion/ometa/client.py", line 201, in _one_request
    resp.raise_for_status()
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 1022, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://openmetadata-server:8585/api/v1/tables/name/Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/metadata/ingestion/ometa/ometa_api.py", line 539, in _get
    resp = self.client.get(f"{self.get_suffix(entity)}/{path}{fields_str}")
  File "/usr/local/lib/python3.9/site-packages/metadata/ingestion/ometa/client.py", line 232, in get
    return self._request("GET", path, data)
  File "/usr/local/lib/python3.9/site-packages/metadata/ingestion/ometa/client.py", line 179, in _request
    return self._one_request(method, url, opts, retry)
  File "/usr/local/lib/python3.9/site-packages/metadata/ingestion/ometa/client.py", line 209, in _one_request
    raise APIError(error, http_error) from http_error
metadata.ingestion.ometa.client.APIError: table instance for Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost not found
[2022-09-21, 13:35:05 UTC] {ometa_api.py:547} WARNING - GET Table for name/Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost.Error 404 - table instance for Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost not found
[2022-09-21, 13:35:05 UTC] {fivetran.py:171} INFO - Lineage Skipped for fisk_cloudsql.fisk.public.ost - Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost
@fredriv @ulixius9 we are applying a lowercase normalizer in ES to avoid this issue:
```
"type": "keyword",
"normalizer": "lowercase_normalizer"
},
```
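For context, a lowercase normalizer in Elasticsearch is declared in the index settings and then attached to a keyword field. The sketch below writes such a definition as a Python dict. The field name `example_field` is hypothetical, since the fragment above does not show which field it belongs to, and the normalizer body is a typical definition rather than a copy of the OpenMetadata index mapping.

```python
import json

# Typical Elasticsearch definition of a custom lowercase normalizer.
# Assumption: the actual OpenMetadata index settings may differ.
index_body = {
    "settings": {
        "analysis": {
            "normalizer": {
                "lowercase_normalizer": {
                    "type": "custom",
                    "char_filter": [],
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            # "example_field" is a hypothetical field name used for illustration.
            "example_field": {
                "type": "keyword",
                "normalizer": "lowercase_normalizer",
            }
        }
    },
}

# Print the body as JSON, the form in which it would be sent when creating an index.
print(json.dumps(index_body, indent=2))
```

With a normalizer like this, keyword values are lowercased at index time and term-level query input is normalized the same way, so a search for the lower-case name can match the upper-case Snowflake identifier.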
@harshach I believe we are calling the backend API directly here instead of querying ES first
I get a 404 when trying to look up the table with the lowercase ID:
curl -i http://localhost:8585/api/v1/tables/name/Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost
HTTP/1.1 404 Not Found
Date: Thu, 22 Sep 2022 07:50:19 GMT
Content-Type: application/json
Content-Length: 100
{"code":404,"message":"table instance for Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost not found"}
But it works when using the uppercase ID:
curl -I http://localhost:8585/api/v1/tables/name/Warehouse_Dev.RAW_DEV.FISK_CLOUDSQL_PUBLIC.OST
HTTP/1.1 200 OK
Date: Thu, 22 Sep 2022 07:53:06 GMT
Content-Type: application/json
Content-Length: 2316
@ulixius9 ES gets called in fqn.build to get the FQN name. We then call the API directly, but we should have been able to find the entity through ES first
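The flow described here can be sketched as two steps: a case-insensitive search that returns the FQN as stored, followed by an exact lookup by that FQN. The sketch uses in-memory stand-ins; `search_fqn`, `get_by_name`, and `resolve_table` are hypothetical helpers and do not reproduce the signatures of `fqn.build` or the OpenMetadata client.

```python
from typing import Dict, Optional

# In-memory stand-in for the backend, keyed by the FQN exactly as stored.
BACKEND: Dict[str, dict] = {
    "Warehouse_Dev.RAW_DEV.FISK_CLOUDSQL_PUBLIC.OST": {"name": "OST"},
}

# Stand-in for a search index with a lowercase normalizer:
# lower-cased FQN -> FQN as stored.
SEARCH_INDEX: Dict[str, str] = {fqn.lower(): fqn for fqn in BACKEND}


def search_fqn(candidate: str) -> Optional[str]:
    """Hypothetical search step: case-insensitive match returning the stored FQN."""
    return SEARCH_INDEX.get(candidate.lower())


def get_by_name(fqn: str) -> Optional[dict]:
    """Hypothetical backend step: exact, case-sensitive lookup (None models a 404)."""
    return BACKEND.get(fqn)


def resolve_table(candidate: str) -> Optional[dict]:
    """Resolve the stored FQN through search first, then fetch the entity."""
    stored = search_fqn(candidate)
    return get_by_name(stored) if stored is not None else None


candidate = "Warehouse_Dev.raw_dev.fisk_cloudsql_public.ost"
print(get_by_name(candidate))    # None
print(resolve_table(candidate))  # {'name': 'OST'}
```

The direct lookup with the lower-case FQN misses, while going through the search step first recovers the stored upper-case FQN before the backend call.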
related to https://github.com/open-metadata/OpenMetadata/issues/7690
I confirmed with the user that this is happening using PostgreSQL as DB for the OM server.
https://github.com/open-metadata/OpenMetadata/pull/9079 has fixed this.
@fredriv, please, reopen the issue if the error still happens on 0.13.1.