Support Lineage through BigQuery View

Open • anantdamle opened this issue 5 years ago • 1 comment

Organizations use Views for abstracting actual tables or for access control.

To build a complete lineage relationship, the lineage of a view must be computed recursively, all the way back to its source tables.
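A minimal sketch of the recursive resolution described here, not the project's implementation. It assumes the direct references of each view are already available as a plain map (the hypothetical `viewParents` argument); in practice they would have to be extracted from each view's SQL. The class and method names are made up for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: resolves a view down to the source tables it ultimately reads. */
class ViewLineageSketch {

  /**
   * @param name table or view to resolve
   * @param viewParents assumed input: view name -> tables/views its SQL references directly.
   *     Any name that is not a key in this map is treated as a source table.
   * @return the source tables reachable from {@code name}, in sorted order
   */
  static Set<String> resolveSourceTables(String name, Map<String, List<String>> viewParents) {
    Set<String> sources = new TreeSet<>();
    resolve(name, viewParents, new HashSet<>(), sources);
    return sources;
  }

  private static void resolve(
      String name,
      Map<String, List<String>> viewParents,
      Set<String> visited,
      Set<String> sources) {
    // Skip names that were already expanded; this also guards against cyclic definitions.
    if (!visited.add(name)) {
      return;
    }
    List<String> parents = viewParents.get(name);
    if (parents == null) {
      // Not a view: recursion ends at a source table.
      sources.add(name);
      return;
    }
    // A view: recurse into everything it references.
    for (String parent : parents) {
      resolve(parent, viewParents, visited, sources);
    }
  }
}
```

With a view `ds.report_view` that reads `ds.clean_view` and `ds.customers`, where `ds.clean_view` in turn reads `ds.raw_orders`, resolving `ds.report_view` yields the two source tables `ds.customers` and `ds.raw_orders`.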

anantdamle, Jul 28 '20 23:07

Possibly related (if so, let me know and I'll kill the issue I just submitted):

If I work through the tutorial at https://cloud.google.com/architecture/building-a-bigquery-data-lineage-solution, the Dataflow job starts throwing the errors below as soon as it starts. It looks like it expects to see only tables, but we (and, I presume, thousands of others) have many views. I understand that this enhancement isn't complete yet, but shouldn't the current version just ignore views if it's not meant to parse them?

Error message from worker: java.lang.IllegalArgumentException: Table Type should be "TABLE" found "VIEW"
  com.google.cloud.solutions.datalineage.converter.BigQuerySchemaConverter.convert(BigQuerySchemaConverter.java:50)
  java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
  java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
  java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
  java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
  java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
  java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
  java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
  java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
  com.google.cloud.solutions.datalineage.service.BigQueryZetaSqlSchemaLoader.loadSchemas(BigQueryZetaSqlSchemaLoader.java:41)
  com.google.cloud.solutions.datalineage.service.BigQueryZetaSqlSchemaLoader.loadSchemas(BigQueryZetaSqlSchemaLoader.java:46)
  com.google.cloud.solutions.datalineage.BigQuerySqlParser.buildCatalogWithQueryTables(BigQuerySqlParser.java:141)
  com.google.cloud.solutions.datalineage.BigQuerySqlParser.resolve(BigQuerySqlParser.java:119)
  com.google.cloud.solutions.datalineage.BigQuerySqlParser.extractColumnLineage(BigQuerySqlParser.java:67)
  com.google.cloud.solutions.datalineage.extractor.QueryJobExtractor.extractColumnLineage(QueryJobExtractor.java:102)
  com.google.cloud.solutions.datalineage.extractor.QueryJobExtractor.extract(QueryJobExtractor.java:86)
  com.google.cloud.solutions.datalineage.extractor.InsertJobTableLineageExtractor.extract(InsertJobTableLineageExtractor.java:59)
  com.google.cloud.solutions.datalineage.transform.LineageExtractionTransform$IdentifyAndExtract.extract(LineageExtractionTransform.java:84)
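For illustration, the "just ignore views" behaviour asked about above could look roughly like the sketch below. This is not the project's code: `TableRef` and `convertibleTables` are hypothetical stand-ins, and the only detail taken from the error is that the converter accepts the type string "TABLE" and rejects "VIEW". Whether the SQL parser can still resolve a query once its views are left out of the catalog is not something this sketch addresses.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: drop non-TABLE entries up front instead of throwing on them. */
class SkipViewsSketch {

  /** Stand-in for whatever metadata object carries a table's name and type. */
  static final class TableRef {
    final String name;
    final String type; // e.g. "TABLE" or "VIEW", as in the error message

    TableRef(String name, String type) {
      this.name = name;
      this.type = type;
    }
  }

  /** Returns the names of entries whose type is "TABLE", preserving input order. */
  static List<String> convertibleTables(List<TableRef> refs) {
    List<String> names = new ArrayList<>();
    for (TableRef ref : refs) {
      // Views (and any other type) are skipped rather than rejected with an exception.
      if ("TABLE".equals(ref.type)) {
        names.add(ref.name);
      }
    }
    return names;
  }
}
```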

brad-meru, Jan 28 '22 16:01