
Temporary tables stored in Atlas

Open whazor opened this issue 5 years ago • 1 comments

Currently the Spark Atlas Connector reports temporary thrift tables to Atlas as spark_table entities. An example lineage report is included below. We have the following questions about these temporary tables:

  • Why does the spark-atlas-connector report temporary tables to Atlas in the first place?
  • If there is a good reason to report the temporary tables:
    • Why are the entities not reported as deleted?
    • Why is there no attribute indicating that the table has been deleted?
    • Why is the table type MANAGED, given that the table is temporary?

Example lineage reported:

createTime: 1553897909000
database: DBNAME
description: [empty]
lastAccessTime: 0
name: o_TABLENAME_xref_20190328
owner: [owner of task]
paritionColumnNames: [empty]
properties: transient_lastDdlTime: 1553897909, bucketing_version: 2
provider: parquet
qualifiedName: thrift://node1:9083,thrift://node2:9083,thrift://node3:9083.DBNAME.o_TABLENAME_xref_20190328
schema: [empty]
storage: thrift://node1:9083,thrift://node2:9083,thrift://node3:9083.DBNAME.o_TABLENAME_xref_20190328.storageFormat
tableType: MANAGED
unsupportedFeatures: [empty]
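
To check in bulk which reported entities correspond to temporary tables, it can help to split the qualifiedName into its parts. The following is a minimal Python sketch, assuming the layout is `<metastore URIs>.<database>.<table>` as the example above suggests. The function name `split_qualified_name` is hypothetical and not part of SAC or Atlas.

```python
from typing import List, Tuple


def split_qualified_name(qualified_name: str) -> Tuple[List[str], str, str]:
    """Split a spark_table qualifiedName into (metastore URIs, database, table).

    Assumes the layout '<uri>[,<uri>...].<database>.<table>' seen in the
    example report. Splitting from the right keeps dots inside host names
    intact, since database and table names contain no dots.
    """
    cluster, database, table = qualified_name.rsplit(".", 2)
    return cluster.split(","), database, table


uris, database, table = split_qualified_name(
    "thrift://node1:9083,thrift://node2:9083,thrift://node3:9083"
    ".DBNAME.o_TABLENAME_xref_20190328"
)
print(uris, database, table)
```

The resulting database and table names can then be compared against the tables that still exist in the metastore to spot entities whose tables are already gone.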

whazor avatar May 29 '19 08:05 whazor

Thanks for reporting! It would be very helpful if you could include steps to reproduce, as well as the branch/commit you used to reproduce the issue.

SAC is something of a "moving target" and we have no plans for official releases yet, so if the issue doesn't reproduce on current master, we may not address it in previous versions/branches.

Thanks again!

HeartSaVioR avatar May 29 '19 08:05 HeartSaVioR