spark-rapids
[FEA] Set default Spark and Spark RAPIDS settings for Databricks
Ideally, users running Spark RAPIDS on Databricks should not need to specify any extra Spark or Spark RAPIDS settings in the cluster's "Spark config" section.
Could we make the following settings the defaults for Databricks?
```
spark.plugins spark.plugins com.nvidia.spark.SQLPlugin
spark.task.resource.gpu.amount (1/spark.executor.cores)
spark.databricks.delta.optimizeWrite.enabled false
spark.sql.adaptive.enabled false
spark.sql.optimizer.dynamicPartitionPruning.enabled false
spark.sql.legacy.parquet.datetimeRebaseModeInWrite EXCEPTION
spark.sql.legacy.parquet.int96RebaseModeInWrite EXCEPTION
spark.rapids.memory.pinnedPool.size 2G
spark.sql.files.maxPartitionBytes 2048m
spark.rapids.sql.concurrentGpuTasks 2
spark.locality.wait 0s
```
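As a rough illustration, the proposed defaults could be assembled programmatically and rendered in the `key value` format the cluster's Spark config box expects. This is only a sketch: the `databricks_rapids_defaults` helper and its `executor_cores` parameter are hypothetical names, not part of spark-rapids; only the config keys and values come from the list above. Note that `spark.task.resource.gpu.amount` is derived from the executor core count, since the single GPU on a node is shared evenly across all concurrent tasks.

```python
def databricks_rapids_defaults(executor_cores: int) -> dict:
    """Build the proposed default Spark/RAPIDS settings for Databricks.

    `executor_cores` is assumed to match spark.executor.cores, so each
    task is assigned 1/executor_cores of the executor's single GPU.
    """
    return {
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        # One GPU per executor, shared by all task slots.
        "spark.task.resource.gpu.amount": str(1 / executor_cores),
        "spark.databricks.delta.optimizeWrite.enabled": "false",
        "spark.sql.adaptive.enabled": "false",
        "spark.sql.optimizer.dynamicPartitionPruning.enabled": "false",
        "spark.sql.legacy.parquet.datetimeRebaseModeInWrite": "EXCEPTION",
        "spark.sql.legacy.parquet.int96RebaseModeInWrite": "EXCEPTION",
        "spark.rapids.memory.pinnedPool.size": "2G",
        "spark.sql.files.maxPartitionBytes": "2048m",
        "spark.rapids.sql.concurrentGpuTasks": "2",
        "spark.locality.wait": "0s",
    }

# Emit lines in the "key value" form used by the Spark config text box.
for key, value in databricks_rapids_defaults(executor_cores=8).items():
    print(key, value)
```

With `executor_cores=8`, the GPU amount works out to `0.125`, meaning eight tasks share one GPU per executor.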
Of course, we would also need to document these default settings for Databricks so that users are aware of the changes.