Anant Damle
This is a co-authored tutorial with @mrpaulthomas
[COLLECTIONS-795] Add a new Iterator to allow zipping over two iterators of different types.
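To illustrate what zipping two differently typed iterators can look like, here is a minimal sketch. The class name `ZipSketch`, the helper `describe`, the use of `Map.Entry` as the pair type, and the choice to stop at the shorter input are all assumptions made for illustration; none of them is taken from the Commons Collections change itself.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical sketch: pairs up the elements of two iterators of different types. */
final class ZipSketch<L, R> implements Iterator<Map.Entry<L, R>> {
  private final Iterator<L> left;
  private final Iterator<R> right;

  ZipSketch(Iterator<L> left, Iterator<R> right) {
    this.left = left;
    this.right = right;
  }

  /** Assumption: iteration ends as soon as either input is exhausted. */
  @Override
  public boolean hasNext() {
    return left.hasNext() && right.hasNext();
  }

  @Override
  public Map.Entry<L, R> next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return new SimpleImmutableEntry<>(left.next(), right.next());
  }

  /** Illustration helper: renders each pair as "left=right". */
  static <L, R> List<String> describe(Iterator<L> left, Iterator<R> right) {
    List<String> out = new ArrayList<>();
    ZipSketch<L, R> zipped = new ZipSketch<>(left, right);
    while (zipped.hasNext()) {
      Map.Entry<L, R> pair = zipped.next();
      out.add(pair.getKey() + "=" + pair.getValue());
    }
    return out;
  }
}
```

With inputs `"a", "b", "c"` and `1, 2`, `describe` yields the two pairs `a=1` and `b=2`; the unmatched `"c"` is dropped under the stop-at-shorter assumption.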
Use Apache Commons CSV for CSV parsing; replace hardcoded options with pipeline options that have defaults.
As part of unit testing, it becomes important to test the logged messages in case of exceptions or other such scenarios. Is this the recommended way to test logging behaviour...
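One common way to assert on logged messages is to attach a capturing handler to the logger under test. The sketch below assumes `java.util.logging`, since the question does not say which logging framework is in use; `LogCaptureSketch`, `CapturingHandler`, `parseOrZero`, and the logger name `sketch.parser` are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical sketch: capture log records so a test can assert on them. */
final class LogCaptureSketch {

  /** Stores every published record in memory instead of writing it anywhere. */
  static final class CapturingHandler extends Handler {
    final List<LogRecord> records = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
      records.add(record);
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
  }

  private static final Logger LOGGER = Logger.getLogger("sketch.parser");

  /** Code under test: logs a warning and falls back to 0 on bad input. */
  static int parseOrZero(String text) {
    try {
      return Integer.parseInt(text);
    } catch (NumberFormatException e) {
      LOGGER.log(Level.WARNING, "could not parse input", e);
      return 0;
    }
  }

  /** Runs the code under test with a capturing handler attached, then detaches it. */
  static List<LogRecord> captureWhileParsing(String text) {
    CapturingHandler handler = new CapturingHandler();
    LOGGER.addHandler(handler);
    LOGGER.setUseParentHandlers(false); // keep test output quiet
    try {
      parseOrZero(text);
    } finally {
      LOGGER.removeHandler(handler);
      LOGGER.setUseParentHandlers(true); // restore the default setting
    }
    return handler.records;
  }
}
```

A test can then check the level, message, and attached exception of each captured `LogRecord`. Removing the handler in a `finally` block keeps one test's records from leaking into the next.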
Organizations use [Views](https://cloud.google.com/bigquery/docs/views-intro) for abstracting actual tables or for access control. To build a complete Lineage relationship, it is important to identify lineage for the view by recursively computing lineage...
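The recursive step can be sketched as a depth-first expansion of view definitions down to base tables. This is only an illustration under stated assumptions: `ViewLineageSketch` and `baseTables` are hypothetical names, and view definitions are assumed to be already available as an in-memory map from a view name to the names it references, rather than fetched from BigQuery.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: resolve a view to the base tables it ultimately reads from. */
final class ViewLineageSketch {

  /**
   * @param name table or view to resolve
   * @param viewSources assumed map of view name to the tables/views it references;
   *     any name missing from the map is treated as a base table
   */
  static Set<String> baseTables(String name, Map<String, List<String>> viewSources) {
    Set<String> result = new TreeSet<>();
    collect(name, viewSources, new HashSet<>(), result);
    return result;
  }

  private static void collect(
      String name,
      Map<String, List<String>> viewSources,
      Set<String> expanded,
      Set<String> result) {
    List<String> sources = viewSources.get(name);
    if (sources == null) {
      result.add(name); // not a view: this is a base table
      return;
    }
    if (!expanded.add(name)) {
      return; // view already expanded; also guards against cyclic definitions
    }
    for (String source : sources) {
      collect(source, viewSources, expanded, result);
    }
  }
}
```

For example, if `v2` references `v1` and `t3`, and `v1` references `t1` and `t2`, then `baseTables("v2", ...)` resolves to `t1`, `t2`, and `t3`.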
The core ZetaSQL parser allows single-statement queries (with support for `WITH` clauses). Need to support complex [BigQuery Scripts](https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting).
BigQueryTableCreator throws `IllegalArgumentException` when trying to generate intermediate table models for a `WITH` statement.
Data Entities are encapsulated as two models [DataEntity](https://github.com/GoogleCloudPlatform/bigquery-data-lineage/blob/85a38594c6deae1f727ae069428601f780cdf917/src/main/proto/lineage_messages.proto#L30),
* [BigQueryTableEntity](https://github.com/GoogleCloudPlatform/bigquery-data-lineage/blob/85a38594c6deae1f727ae069428601f780cdf917/src/main/java/com/google/cloud/solutions/datalineage/model/BigQueryTableEntity.java#L28)
* [CloudStorageFile](https://github.com/GoogleCloudPlatform/bigquery-data-lineage/blob/85a38594c6deae1f727ae069428601f780cdf917/src/main/java/com/google/cloud/solutions/datalineage/model/CloudStorageFile.java#L34)

This needs to be simplified by moving all models to Proto messages for standardization.
BigQuery supports writing Dataflow SQL statements to be executed using Dataflow jobs. Parse Dataflow SQL for batch jobs to identify lineage and tag them.