Shashank Agarwal

Results 22 issues of Shashank Agarwal

Update Spanner templates for GoogleSQL vs PG Interface

bug
documentation

The JDBCToJDBC template needs to accept a list of primary key column names when tables are auto-generated. For an example config, see the GCSToSpanner template: https://github.com/GoogleCloudPlatform/dataproc-templates/blob/main/java/src/main/java/com/google/cloud/dataproc/templates/gcs/GCSToSpanner.java#L70 Also update the corresponding notebooks (such as MSSQLToPostgres).

enhancement
python
high-priority

For RDBMS, the hardening goal can be 1 TB.

good first issue
hardening
java

For RDBMS, the hardening goal can be 100 GB.

good first issue
hardening
java

Test whether the following scenarios work with the GCSToSpanner template. You may use any of the templates (JDBCToGCS, BQToGCS, etc.) to generate test data in GCS. 1. Test for appending data into...

test-case

Write a document describing how to debug and/or scale templates.

good first issue
publishing

All notebooks have a hardcoded working directory. It needs to be computed dynamically, since a user may check out the repository into a different sub-directory.

bug
Notebook
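
One possible way to compute the working directory dynamically, sketched in Python under stated assumptions: the helper name `find_repo_root` is hypothetical, and using a `.git` entry as the marker for the checkout root is an assumption, not something the notebooks are known to do. The idea is to walk upward from the current directory until the marker is found, instead of hardcoding a path.

```python
from pathlib import Path


def find_repo_root(start: Path, marker: str = ".git") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found.

    `marker` is an assumed convention (a `.git` entry at the checkout root);
    any file or directory known to exist at the root would work as well.
    """
    current = start.resolve()
    # Check the starting directory first, then each ancestor in turn.
    for candidate in (current, *current.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No {marker!r} found at or above {current}")
```

In a notebook, this could be used as `os.chdir(find_repo_root(Path.cwd()))`, so the working directory follows the checkout wherever it lives.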

Test with 5 TB of data in the source. Make changes to the template code as necessary to make it work.

good first issue
hardening
java

good first issue
phase-2
hardening
java

https://cloud.google.com/dataproc-serverless/docs/concepts/versions/dataproc-serverless-versions#supported-dataproc-serverless-for-spark-runtime-versions Upgrade the Dataproc Serverless runtime version from 1.1 to 1.2. This needs to be done for both Java and Python.

high-priority
dependencies