Tableau datasources duplicating lineages with uppercase databases/table names
Describe the bug
When metadata is ingested from Tableau, the database-level entity does not merge with the metadata ingested from the database itself if the database name (and presumably the table name) is uppercase.
To Reproduce
Steps to reproduce the behavior:
- Create a PostgreSQL database with a fully uppercase name, e.g. DATABASE
- Use that database in a Tableau dashboard as a live datasource
- Ingest metadata from Tableau and PostgreSQL
- View lineages
Expected behavior
The database name is imported with the expected case.
Screenshots
Two datasets with the "same" name.
Additional context
URNs for the above objects:
Tableau ->> urn:li:dataset:(urn:li:dataPlatform:postgres,nova.public.fact_enrolments_plus,PROD)
Postgres ->> urn:li:dataset:(urn:li:dataPlatform:postgres,NOVA.public.fact_enrolments_plus,PROD)
Confirmed in the Tableau Metadata API that the name is as expected:
{
"id": "c0a4dfbc-1193-6a2f-a91a-55c950cae875",
"name": "NOVA",
"connectionType": "postgres"
}
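To make the mismatch concrete, here is a minimal Python sketch that compares the two URNs above. The helpers `parse_dataset_urn` and `same_dataset_ignoring_case` are hypothetical names written for this illustration and are not part of DataHub; the regex assumes the simple `urn:li:dataset:(platform,name,env)` shape shown in this issue.

```python
import re
from typing import Tuple

# Hypothetical pattern for the URN shape shown in this issue (not a DataHub API).
URN_PATTERN = re.compile(
    r"^urn:li:dataset:\(urn:li:dataPlatform:(?P<platform>[^,]+),"
    r"(?P<name>[^,]+),(?P<env>[^)]+)\)$"
)


def parse_dataset_urn(urn: str) -> Tuple[str, str, str]:
    """Split a dataset URN into (platform, name, env)."""
    match = URN_PATTERN.match(urn)
    if match is None:
        raise ValueError(f"not a dataset URN: {urn}")
    return match["platform"], match["name"], match["env"]


def same_dataset_ignoring_case(a: str, b: str) -> bool:
    """True when two URNs differ only in the case of the dataset name."""
    platform_a, name_a, env_a = parse_dataset_urn(a)
    platform_b, name_b, env_b = parse_dataset_urn(b)
    return (platform_a, name_a.lower(), env_a) == (platform_b, name_b.lower(), env_b)


tableau_urn = "urn:li:dataset:(urn:li:dataPlatform:postgres,nova.public.fact_enrolments_plus,PROD)"
postgres_urn = "urn:li:dataset:(urn:li:dataPlatform:postgres,NOVA.public.fact_enrolments_plus,PROD)"

print(tableau_urn == postgres_urn)                             # False: exact match fails
print(same_dataset_ignoring_case(tableau_urn, postgres_urn))   # True: only the case differs
```

Because URNs are compared as exact strings, the two entries show up as separate datasets even though they refer to the same table.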
Originally I thought this might be an issue with the Tableau ingest changing the uppercase names to lowercase. However, after a conversation with @shirshanka on Slack, it looks like this is an issue with the Postgres ingest not converting names to lowercase.
I saw that there's an environment variable that can be set as a workaround, DATAHUB_DATASET_URN_TO_LOWER, and it looks like the intention is for everything to go to lowercase, but some things may have been left out. I don't see an issue related to this at the moment, so I'll leave this open, but feel free to close it if this is the appropriate workaround for the time being.
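For anyone trying the workaround, this is a rough sketch of what a lowercase flag does to the resulting URN. Only the variable name DATAHUB_DATASET_URN_TO_LOWER comes from this issue; `urn_to_lower_enabled` and `build_dataset_urn` are hypothetical stand-ins, and the assumption that the flag is switched on with the string "true" should be checked against the DataHub version in use.

```python
import os


def urn_to_lower_enabled() -> bool:
    # Assumption: the flag is considered on when its value is "true".
    return os.environ.get("DATAHUB_DATASET_URN_TO_LOWER", "false").lower() == "true"


def build_dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Hypothetical URN builder that lowercases the name when the flag is set."""
    if urn_to_lower_enabled():
        name = name.lower()
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


# With the flag set before ingestion, the Postgres-side name would match Tableau's.
os.environ["DATAHUB_DATASET_URN_TO_LOWER"] = "true"
print(build_dataset_urn("postgres", "NOVA.public.fact_enrolments_plus"))
```

The variable has to be set in the environment of the process that runs the Postgres ingestion, and entities ingested before the change would likely keep their old uppercase URNs until they are removed and re-ingested.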
Thanks!
Hello, I have a similar problem with Oracle instead of Postgres: the Tableau import uses names in the form "host:port.schema.table" for the Oracle sources:
Oracle import: urn:li:dataPlatform:oracle,qual_don_prop.workflows
Tableau import: urn:li:dataPlatform:oracle,zinnt:1525.qual_don_prop.workflows
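The Oracle mismatch is a different kind of difference (an extra "host:port." prefix rather than case). A small sketch of how the two names relate, assuming the prefix always has the form `host:port.`; `strip_host_port_prefix` is a hypothetical helper for illustration, not something DataHub provides.

```python
import re


def strip_host_port_prefix(name: str) -> str:
    """Drop a leading 'host:port.' segment from a dataset name, if present."""
    # Assumption: the prefix is everything up to the first colon, then digits and a dot.
    return re.sub(r"^[^:]+:\d+\.", "", name, count=1)


oracle_name = "qual_don_prop.workflows"
tableau_name = "zinnt:1525.qual_don_prop.workflows"

print(strip_host_port_prefix(tableau_name) == oracle_name)  # True
print(strip_host_port_prefix(oracle_name) == oracle_name)   # True: no prefix, unchanged
```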
This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io
This issue was closed because it has been inactive for 30 days since being marked as stale.