spark-rapids
Adding AutoTuner to Profiling Tool
Fixes #6300
This is a fork of the previous PR #6301, which failed CI. The following changes were applied to the original PR:
- Fix broken test AnalysisSuite
- Remove AutoTunerSuite (tests will be added in the follow-up #6334)
- Fix a divide-by-zero in the AutoTuner
- Upmerge with 22.10
Feature Description:
Added an AutoTuner module that uses heuristic-based techniques to recommend Spark configurations for running Spark on RAPIDS.
Usage
java -cp rapids-4-spark-tools_2.12-<version>.jar:$SPARK_HOME/jars/* \
  com.nvidia.spark.rapids.tool.profiling.ProfileMain [existing options] \
  --auto-tuner --worker-info <system properties yaml> <eventlogs>
Example
System Properties -
system:
  num_cores: 64
  cpu_arch: x86_64
  memory: 512gb
  free_disk_space: 800gb
  time_zone: America/Los_Angeles
  num_workers: 4
gpu:
  count: 8
  memory: 32gb
  name: NVIDIA V100
Generated Recommendation -
Spark Properties:
--conf spark.executor.cores=8
--conf spark.executor.instances=32
--conf spark.executor.memory=63.75g
--conf spark.executor.memoryOverhead=8.38g
--conf spark.rapids.memory.pinnedPool.size=2g
--conf spark.rapids.sql.concurrentGpuTasks=4
--conf spark.sql.files.maxPartitionBytes=4g
--conf spark.sql.shuffle.partitions=2000
--conf spark.task.resource.gpu.amount=0.125
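Note for reviewers: three of the values above (spark.executor.cores, spark.executor.instances, spark.task.resource.gpu.amount) are consistent with simple arithmetic on the worker info in this example. The Python sketch below reproduces only those three numbers. It is not the AutoTuner's implementation: the function names `sketch_recommendations` and `to_conf_lines` are hypothetical, and the one-executor-per-GPU rule is an assumption inferred from the example, not taken from the code. The memory-related settings are deliberately left out because their formulas cannot be read off the example.

```python
def sketch_recommendations(num_cores, num_workers, gpu_count):
    """Hypothetical helper: derive three Spark properties from worker info.

    Assumption (inferred from the example only): one executor per GPU,
    with the worker's cores split evenly across its GPUs.
    """
    # Guard the divisions so a malformed worker-info file cannot
    # trigger a divide-by-zero.
    if gpu_count <= 0:
        raise ValueError("gpu count must be positive")
    executor_cores = num_cores // gpu_count
    if executor_cores <= 0:
        raise ValueError("need at least one core per GPU")

    return {
        "spark.executor.cores": executor_cores,
        "spark.executor.instances": num_workers * gpu_count,
        # Each task gets an equal share of the executor's single GPU.
        "spark.task.resource.gpu.amount": 1.0 / executor_cores,
    }


def to_conf_lines(props):
    """Render properties as sorted '--conf key=value' lines."""
    return [f"--conf {key}={value}" for key, value in sorted(props.items())]


if __name__ == "__main__":
    # Values from the System Properties example: 64 cores, 4 workers, 8 GPUs.
    for line in to_conf_lines(sketch_recommendations(64, 4, 8)):
        print(line)
```

With the example worker info (64 cores, 4 workers, 8 GPUs per worker) this yields 8 cores per executor, 32 executor instances, and a GPU amount of 0.125 per task, matching the generated recommendation.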