dataproc-templates

[hardening] [java] jdbctobq

Open shashank-google opened this issue 2 years ago • 4 comments

For RDBMS sources, the hardening goal can be 1 TB.

shashank-google avatar Jun 10 '22 05:06 shashank-google

In progress. I will use the JDBC table from issue #148 to test this template for 1 TB hardening.

abhijat-gupta avatar Dec 05 '22 04:12 abhijat-gupta

I am trying out the spark-bigquery-connector for loading data into BigQuery.

abhijat-gupta avatar Dec 08 '22 06:12 abhijat-gupta

Resumed the hardening of this template on 26th Dec. Will continue to test it for the next few days. Trying to close it before New Year's. Thanks.

abhijat-gupta avatar Dec 26 '22 03:12 abhijat-gupta

Scope:

1 TB of data (total). Table sizes TBD.

Main thing we want to observe: how the Spark template behaves. The Spark template reads from a single table via JDBC and writes to a single BQ table. A single job will spin up multiple threads.
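To make the "single job, multiple threads" point concrete, here is a rough sketch of how one table read can be split into ranges. Spark's JDBC reader accepts `partitionColumn`, `lowerBound`, `upperBound` and `numPartitions` options and issues one ranged query per partition. Whether this template exposes those options, and how, is an assumption here. The class name, the `id` column and the bounds are hypothetical, and the logic is a simplified approximation of the idea, not Spark's own code.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: splits a numeric key range into one WHERE predicate per partition. */
class JdbcPartitionSketch {

    /**
     * Returns one predicate per partition for the given column.
     * The first partition also takes NULLs and values below lowerBound;
     * the last partition takes everything from the final boundary upward.
     */
    static List<String> predicates(String column, long lowerBound, long upperBound, int numPartitions) {
        if (numPartitions < 1 || lowerBound >= upperBound) {
            throw new IllegalArgumentException("need numPartitions >= 1 and lowerBound < upperBound");
        }
        List<String> result = new ArrayList<>();
        if (numPartitions == 1) {
            result.add("1=1"); // a single partition scans the whole table
            return result;
        }
        // Width of each range; at least 1 so the boundaries keep advancing.
        long stride = Math.max(1, (upperBound - lowerBound) / numPartitions);
        long boundary = lowerBound;
        for (int i = 0; i < numPartitions; i++) {
            String lower = (i == 0) ? null : column + " >= " + boundary;
            boundary += stride;
            String upper = (i == numPartitions - 1) ? null : column + " < " + boundary;
            if (lower == null) {
                result.add(upper + " OR " + column + " IS NULL");
            } else if (upper == null) {
                result.add(lower);
            } else {
                result.add(lower + " AND " + upper);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical key column and bounds, for illustration only.
        for (String predicate : predicates("id", 0L, 1_000_000L, 4)) {
            System.out.println(predicate);
        }
    }
}
```

The open first and last ranges mean rows outside the stated bounds are still read. The bounds only shape how evenly the work is spread, which is worth watching on skewed keys during the 1 TB runs.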

Databases including: MySQL, Postgres, MSSQL

Table type(s): heaps, clustered index, unique index

Data Types: Various common datatypes, including those with known issues, such as datetime
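
One way to keep track of the scope above is to enumerate the database and table type combinations up front. This is only a sketch: the class name is hypothetical, and whether every table type applies to every engine is not settled in this thread, so some combinations may end up being dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch only: lists the database x table-type combinations named in the scope. */
class HardeningMatrixSketch {

    static final List<String> DATABASES = Arrays.asList("MySQL", "Postgres", "MSSQL");
    static final List<String> TABLE_TYPES = Arrays.asList("heap", "clustered index", "unique index");

    /** Returns every "database / table type" pair as a readable label. */
    static List<String> combinations() {
        List<String> result = new ArrayList<>();
        for (String database : DATABASES) {
            for (String tableType : TABLE_TYPES) {
                result.add(database + " / " + tableType);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String combination : combinations()) {
            System.out.println(combination);
        }
    }
}
```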

mattfsmith avatar Feb 13 '23 21:02 mattfsmith