[SPARK-33274][SS] Stop query in cp mode when total cores less than total kafka partition
What changes were proposed in this pull request? Add a check on the total executor cores when the SetReaderPartitions message is received.
Why are the changes needed? In continuous processing mode, EpochCoordinator won't commit offsets for the query until it has received ReportPartitionOffset from all partitions. Normally, each Kafka topic partition is handled by one core, so if the total number of cores is smaller than the total number of Kafka topic partitions, the job hangs without any error message.
Does this PR introduce any user-facing change?
Yes, if the total number of executor cores is smaller than the total Kafka partition count, an exception with the following message will be thrown:
Total $numReaderPartitions (kafka partitions) * $cpusPerTask (cpus per task) = $neededCores needed, but only have min($numExecutors (executors) * $coresPerExecutor (cores per executor) = $totalRequestedCores, spark.cores.max = $maxCores) = $totalAvailableCores (total cores). Please increase total number of executor cores to at least $neededCores.
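Conceptually, the check compares the cores needed (one task per Kafka partition) against the cores the cluster can actually provide. A minimal Python sketch of that arithmetic (the real change lives in Spark's Scala EpochCoordinator; the function name and all values below are illustrative, not the PR's actual code):

```python
def check_enough_cores(num_reader_partitions, cpus_per_task,
                       num_executors, cores_per_executor, max_cores):
    """Raise if the cluster cannot run one long-lived task per Kafka partition."""
    needed_cores = num_reader_partitions * cpus_per_task
    total_requested_cores = num_executors * cores_per_executor
    # spark.cores.max caps what the application may use cluster-wide
    total_available_cores = min(total_requested_cores, max_cores)
    if total_available_cores < needed_cores:
        raise RuntimeError(
            f"Total {num_reader_partitions} (kafka partitions) * {cpus_per_task} "
            f"(cpus per task) = {needed_cores} needed, but only have "
            f"{total_available_cores} (total cores). Please increase total "
            f"number of executor cores to at least {needed_cores}.")

# 10 partitions * 1 cpu per task = 10 cores needed, but min(4*2, 16) = 8 available
try:
    check_enough_cores(10, 1, num_executors=4, cores_per_executor=2, max_cores=16)
except RuntimeError as e:
    print("query stopped:", e)
```

Failing fast here replaces a silent hang with an actionable error, since a continuous-processing query can never make progress when some partitions have no core to run on.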
How was this patch tested? Added a test in EpochCoordinatorSuite.
Can one of the admins verify this patch?
sorry, didn't mean to approve.
@HeartSaVioR @xuanyuanking @gaborgsomogyi @viirya Could you help take a look?
@Ngone51 @viirya @HeartSaVioR @xuanyuanking @gaborgsomogyi Could you help take a look at this?
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!