[FEA] Enhance profiler recommendations for key settings for performance tuning
I wish the RAPIDS Accelerator for Apache Spark profiling tool would recommend optimal values for key settings to improve application performance. The relevant settings are:
- [ ] spark.sql.files.maxPartitionBytes: test your latest changes against more benchmarks, and try to separate CPU and GPU task sizes
- [ ] spark.sql.shuffle.partitions: look into more granular spill metrics to derive a heuristic for when to increase or decrease this value
- [ ] spark.rapids.sql.batchSizeBytes: look into more granular spill metrics to derive a heuristic for when to increase or decrease this value
- [ ] spark.rapids.sql.reader.batchSizeBytes: look into more granular spill metrics to derive a heuristic for when to increase or decrease this value
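To make the spill-based idea in the items above concrete, here is a rough sketch of one possible heuristic for `spark.sql.shuffle.partitions`. It is not the profiler's actual logic. The `StageMetrics` fields, the function name, the 10% spill threshold, and the growth factor of 2 are all assumptions made up for illustration and would need to be validated against benchmarks.

```python
from dataclasses import dataclass


@dataclass
class StageMetrics:
    """Hypothetical per-stage aggregates; these names do not come from the profiler."""
    shuffle_partitions: int   # current spark.sql.shuffle.partitions
    total_tasks: int          # tasks in the shuffle stage
    tasks_with_spill: int     # tasks that reported any spill


def recommend_shuffle_partitions(
    metrics: StageMetrics,
    spill_task_ratio_threshold: float = 0.1,  # assumed cutoff, not tuned
    growth_factor: int = 2,                   # assumed step size, not tuned
) -> int:
    """Return a suggested spark.sql.shuffle.partitions value.

    If more than the threshold fraction of tasks spilled, suggest more
    partitions so each task handles less data; otherwise keep the current value.
    """
    if metrics.total_tasks <= 0:
        # No task data to reason about, so leave the setting unchanged.
        return metrics.shuffle_partitions

    spill_ratio = metrics.tasks_with_spill / metrics.total_tasks
    if spill_ratio > spill_task_ratio_threshold:
        return metrics.shuffle_partitions * growth_factor
    return metrics.shuffle_partitions


if __name__ == "__main__":
    heavy_spill = StageMetrics(shuffle_partitions=200, total_tasks=200, tasks_with_spill=50)
    print(recommend_shuffle_partitions(heavy_spill))
```

A similar shape could apply to the two batch size settings, with the direction reversed (shrink the batch when spill is high), but whether that holds in practice is exactly what the more granular spill metrics would need to show.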