gazelle_plugin
Add a strategy to fall back to Vanilla Spark shuffle manager
What changes were proposed in this pull request?
Add a strategy to fall back to the Vanilla Spark shuffle manager:
- Enable the fallback shuffle configuration and reuse ColumnarShuffleExchangeExec.
- Initiate the splitter iterator in the ShuffleDependency, and transform it into an RDD of Product2[Int, ColumnarBatch].
- Serialize the record batches to the shuffle writer of Vanilla Spark.
How does this patch work?
When submitting an application, we use the native SQL engine with the default ColumnarShuffleManager configuration:
--conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager
However, in some situations we want to specify a custom or different shuffle manager. To enable the Vanilla Spark shuffle manager, set:
--conf spark.shuffle.manager=org.apache.spark.shuffle.sort.SortShuffleManager --conf spark.oap.sql.columnar.enableFallbackShuffle=true
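For illustration, the two fallback flags above could be passed together in a single submit command. This is only a sketch: the application class `com.example.MyApp` and the jar `my-app.jar` are hypothetical placeholders, and the other native SQL engine settings (plugin jars, extensions, memory options) are assumed to be configured elsewhere.

```shell
# Sketch only: com.example.MyApp and my-app.jar are hypothetical placeholders.
# The two --conf flags are the ones given in this description; all other
# native SQL engine settings are assumed to be supplied separately.
spark-submit \
  --class com.example.MyApp \
  --conf spark.shuffle.manager=org.apache.spark.shuffle.sort.SortShuffleManager \
  --conf spark.oap.sql.columnar.enableFallbackShuffle=true \
  my-app.jar
```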
How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
Thanks for opening a pull request!
Could you open an issue for this pull request on GitHub Issues?
https://github.com/oap-project/native-sql-engine/issues
Then could you also rename the commit message and pull request title in the following format?
[NSE-${ISSUES_ID}] ${detailed message}
See also:
Assuming the first two commits are not relevant to your patch, please do NOT include them. If your work depends on these commits, it would be better to open a dedicated PR to port them to the main branch.