dataproc-templates

Dataproc templates and pipelines for solving simple in-cloud data tasks

Results 104 dataproc-templates issues

ref. [[Spike] Explore the usage of PubSub Lite Spark Connector for Python template](https://github.com/GoogleCloudPlatform/dataproc-templates/issues/594)

new-template
python

Enable **mssql-to-postgres-notebook** to be executed from the command line.

1. Create a script like [this](https://github.com/GoogleCloudPlatform/dataproc-templates/blob/main/notebooks/mysql2spanner/MySqlToSpanner_parameterize_script.py) to execute the notebook with command line arguments.
2. Update the notebook to use `IS_PARAMETERIZED`...

Notebook
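
The two steps in the issue above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: it assumes the notebook is run with the third-party `papermill` package, the flag names (`--input-notebook`, `--output-notebook`, `--param`) are hypothetical, and treating `IS_PARAMETERIZED` as a boolean set to `True` is a guess, since the issue text is truncated at that point.

```python
import argparse


def parse_args(argv=None):
    """Parse hypothetical command line flags for a parameterized notebook run."""
    parser = argparse.ArgumentParser(description="Run a notebook with CLI arguments")
    parser.add_argument("--input-notebook", required=True, help="Path to the source .ipynb")
    parser.add_argument("--output-notebook", required=True, help="Path for the executed copy")
    # Repeatable KEY=VALUE pairs that are forwarded to the notebook as parameters.
    parser.add_argument("--param", action="append", default=[], metavar="KEY=VALUE")
    return parser.parse_args(argv)


def build_parameters(pairs):
    """Turn KEY=VALUE strings into a parameter dict for the notebook."""
    # Assumption: the notebook checks IS_PARAMETERIZED to skip its interactive defaults.
    params = {"IS_PARAMETERIZED": True}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"Expected KEY=VALUE, got: {pair!r}")
        params[key] = value
    return params


def run_notebook(args):
    """Execute the notebook with the collected parameters."""
    # papermill is imported lazily so argument parsing works without it installed.
    import papermill as pm

    pm.execute_notebook(
        args.input_notebook,
        args.output_notebook,
        parameters=build_parameters(args.param),
    )
```

A wrapper script would then call `run_notebook(parse_args())`, for example with `--input-notebook in.ipynb --output-notebook out.ipynb --param KEY=VALUE`.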

Enhance the README files with the common issues we are seeing and fixing for users, along with their solutions

documentation
enhancement

Because we control parallelism via `numPartitions`, we do not want Oracle to then run the Spark queries in parallel as well. It would be really easy to overload a source system if this...

enhancement
python
java
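
One possible way to address the Oracle parallelism issue above, sketched in Python: Spark's JDBC data source accepts a `sessionInitStatement` option, which could be used to turn off Oracle parallel query for each partition's session. The issue text is truncated and does not say which approach the templates take, so the helper name `oracle_jdbc_read_options` and the choice of `ALTER SESSION DISABLE PARALLEL QUERY` are assumptions made for illustration.

```python
def oracle_jdbc_read_options(url, table, partition_column,
                             lower_bound, upper_bound, num_partitions):
    """Build Spark JDBC reader options for a partitioned Oracle read.

    Hypothetical helper: Spark opens up to num_partitions connections, and
    each session is asked not to parallelize its own query inside Oracle.
    """
    return {
        "url": url,
        "dbtable": table,
        # Spark-side parallelism: one query per partition range.
        "partitionColumn": partition_column,
        "lowerBound": str(lower_bound),
        "upperBound": str(upper_bound),
        "numPartitions": str(num_partitions),
        # Assumption: disable Oracle parallel query per session so the source
        # sees at most num_partitions concurrent serial queries.
        "sessionInitStatement": "ALTER SESSION DISABLE PARALLEL QUERY",
    }
```

With a live `SparkSession` named `spark`, the options could be passed as `spark.read.format("jdbc").options(**opts).load()`.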

Enable **OracleToSpanner_notebook** to be executed from the command line.

1. Create a script like [this](https://github.com/GoogleCloudPlatform/dataproc-templates/blob/main/notebooks/mysql2spanner/MySqlToSpanner_parameterize_script.py) to execute the notebook with command line arguments.
2. Update the notebook to use `IS_PARAMETERIZED`...

Notebook

Update Spanner templates for GoogleSQL vs PG Interface

bug
documentation