dataproc-templates
[hardening] [java] jdbctobq
For RDBMS sources, the hardening goal can be 1 TB.
In progress. I will use the JDBC table from issue #148 to test 1 TB hardening for this issue.
I am trying out the spark-bigquery-connector for loading data into BigQuery.
Resumed hardening of this template on 26 Dec. I will continue testing it over the next few days and am trying to close it before New Year's. Thanks.
Scope:
1 TB of data (total); table sizes TBD
Main thing we want to observe: how the Spark template behaves. The template reads from a single table via JDBC and writes to a single BQ table. A single job will fan out into multiple threads.
Databases include: MySQL, Postgres, MSSQL
Table type(s): heaps, clustered index, unique index
Data Types: Various common datatypes, including those with known issues, such as datetime
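To make the "single job, multiple threads" point in the scope concrete: one way a single-table JDBC read can be parallelized in Spark is a partitioned read, where the JDBC data source options `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` split one table scan into several range queries. Whether this template is configured that way during hardening is an assumption here, not something stated in this issue. The sketch below is plain Java with no Spark dependency. It only approximates how a numeric key range might be divided into per-partition predicates, and it is not Spark's exact algorithm. The class name, method name, and the `id` column are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only: approximates how a numeric
// key range could be divided into per-partition WHERE predicates.
public class JdbcPartitionSketch {

    // Returns one predicate per partition. The first partition also takes
    // NULL keys and everything below its upper edge; the last partition is
    // open-ended, so rows outside [lower, upper) are still read.
    public static List<String> partitionPredicates(
            String column, long lower, long upper, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }
        if (upper - lower < numPartitions) {
            throw new IllegalArgumentException("range is smaller than numPartitions");
        }
        List<String> predicates = new ArrayList<>();
        if (numPartitions == 1) {
            predicates.add("1=1"); // single partition: no range filter
            return predicates;
        }
        long stride = (upper - lower) / numPartitions;
        for (int i = 0; i < numPartitions; i++) {
            long start = lower + i * stride;
            long end = start + stride;
            if (i == 0) {
                predicates.add(column + " < " + end + " OR " + column + " IS NULL");
            } else if (i == numPartitions - 1) {
                predicates.add(column + " >= " + start);
            } else {
                predicates.add(column + " >= " + start + " AND " + column + " < " + end);
            }
        }
        return predicates;
    }

    public static void main(String[] args) {
        // Example: split a hypothetical id range of 0..100 into 4 range queries.
        for (String p : partitionPredicates("id", 0, 100, 4)) {
            System.out.println(p);
        }
    }
}
```

With skewed keys, evenly sized ranges like these do not give evenly sized partitions, which may be worth watching across the heap, clustered index, and unique index table types listed in the scope.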