
Redshift lineage cannot be obtained accurately when a temp table is involved.

Open yingyingqiqi opened this issue 2 years ago • 2 comments

Describe the bug When a temp table is involved, Redshift lineage cannot be obtained accurately.

To Reproduce

DROP TABLE IF EXISTS tmp_table;
create temp table tmp_table as
select
a,
b
from tableAA;

insert into tableBB select * from tmp_table;

Result: the lineage for tableBB contains neither tmp_table nor tableAA.

Expected behavior The lineage for tableBB should include tmp_table and tableAA.

May 11 '22 03:05 yingyingqiqi

https://github.com/datahub-project/datahub/blob/075d19ef166177ececfbb39796de4721bdde9dc1/metadata-ingestion/src/datahub/ingestion/source/sql/redshift.py#L798-L853

The likely cause is that SVV_TABLE_INFO does not retain temporary table information permanently; it only includes temporary tables created by a user in the current session. I can't find anywhere that Redshift stores all temporary tables, so I may need to derive the lineage by parsing the DDL.
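To illustrate what parsing the lineage from DDL could look like, here is a minimal sketch in Python. It is not how DataHub does it: the function name `extract_edges` and both regular expressions are hypothetical, and they only cover the two statement shapes in the reproduction above (a single source table, no joins, CTEs, or subqueries). A real implementation would need a proper SQL parser.

```python
import re

# Hypothetical, deliberately simplified patterns. They only handle
# "create [temp] table X as select ... from Y" and
# "insert into X select ... from Y" with one unqualified source table.
CTAS = re.compile(
    r"create\s+(?:temp(?:orary)?\s+)?table\s+(\w+)\s+as\s+select\b.*?\bfrom\s+(\w+)",
    re.IGNORECASE | re.DOTALL,
)
INSERT_SELECT = re.compile(
    r"insert\s+into\s+(\w+)\s+select\b.*?\bfrom\s+(\w+)",
    re.IGNORECASE | re.DOTALL,
)


def extract_edges(sql):
    """Return (upstream, downstream) table pairs found in a SQL script."""
    edges = []
    # Naive statement split; assumes no semicolons inside string literals.
    for statement in sql.split(";"):
        for pattern in (CTAS, INSERT_SELECT):
            match = pattern.search(statement)
            if match:
                downstream, upstream = match.group(1), match.group(2)
                edges.append((upstream, downstream))
    return edges


script = """
DROP TABLE IF EXISTS tmp_table;
create temp table tmp_table as
select a, b
from tableAA;

insert into tableBB select * from tmp_table;
"""
print(extract_edges(script))
```

Because the edges come from the query text rather than from SVV_TABLE_INFO, the temp table still appears in them after the session that created it has ended.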

May 13 '22 03:05 yingyingqiqi

Currently, we drop tables that no longer exist from the lineage. A possible solution is to resolve connections of the form A -> TempB -> C to A -> C when TempB no longer exists.
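A minimal sketch of that A -> TempB -> C collapsing, in Python. This is an illustration under stated assumptions, not the DataHub implementation: the function name `collapse_missing_tables`, the edge representation as (upstream, downstream) pairs, and the `existing_tables` set are all hypothetical.

```python
def collapse_missing_tables(edges, existing_tables):
    """Rewire lineage edges around tables that no longer exist.

    edges: iterable of (upstream, downstream) table-name pairs.
    existing_tables: set of table names that still exist.
    """
    # Index the direct upstreams of every table.
    upstreams = {}
    for upstream, downstream in edges:
        upstreams.setdefault(downstream, set()).add(upstream)

    def resolve(table, seen):
        # A table that still exists is kept as a lineage endpoint.
        if table in existing_tables:
            return {table}
        # Guard against cycles among missing tables.
        if table in seen:
            return set()
        seen = seen | {table}
        # A missing table is replaced by its own resolved upstreams.
        # If it has none, it simply disappears from the lineage.
        resolved = set()
        for parent in upstreams.get(table, ()):
            resolved |= resolve(parent, seen)
        return resolved

    collapsed = set()
    for downstream, parents in upstreams.items():
        if downstream not in existing_tables:
            continue  # never emit an edge that points at a missing table
        for parent in parents:
            for source in resolve(parent, frozenset()):
                collapsed.add((source, downstream))
    return sorted(collapsed)


edges = [("tableAA", "tmp_table"), ("tmp_table", "tableBB")]
print(collapse_missing_tables(edges, {"tableAA", "tableBB"}))
```

With the tables from the reproduction, the single remaining edge is tableAA -> tableBB. Chains of several temp tables collapse the same way because `resolve` recurses until it reaches a table that still exists.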

Aug 24 '22 14:08 treff7es

This was fixed by https://github.com/datahub-project/datahub/pull/9704

Feb 12 '24 20:02 hsheth2