Add a strategy to fall back to Vanilla Spark shuffle manager

Open lviiii opened this issue 2 years ago • 2 comments

What changes were proposed in this pull request?

Add a strategy to fall back to the Vanilla Spark shuffle manager:
- Enable the fallback shuffle configuration and reuse ColumnarShuffleExchangeExec.
- Initiate the splitter iterator in the ShuffleDependency and transform it into an RDD of Product2[Int, ColumnarBatch].
- Serialize the record batches to the Vanilla Spark shuffle writer.

How does this patch work?

When submitting an application, we use the native SQL engine with the default ColumnarShuffleManager configuration:
--conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager

However, in some situations we want to specify a custom or different shuffle manager. To enable the Vanilla Spark shuffle manager, set:
--conf spark.shuffle.manager=org.apache.spark.shuffle.sort.SortShuffleManager --conf spark.oap.sql.columnar.enableFallbackShuffle=true
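
To make the two modes easier to compare, here is a minimal shell sketch that prints the --conf flags for each one. The helper name shuffle_conf and the mode names "columnar" and "vanilla" are hypothetical and not part of this patch; only the configuration keys and class names come from the description above.

```shell
# Hypothetical helper (not part of the patch): prints the --conf flags
# for the requested shuffle mode, using the keys quoted in this PR.
shuffle_conf() {
  case "$1" in
    columnar)
      # Default: the columnar shuffle manager of the native SQL engine.
      printf '%s\n' "--conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager"
      ;;
    vanilla)
      # Fallback: Vanilla Spark sort shuffle plus the fallback switch.
      printf '%s\n' "--conf spark.shuffle.manager=org.apache.spark.shuffle.sort.SortShuffleManager --conf spark.oap.sql.columnar.enableFallbackShuffle=true"
      ;;
    *)
      printf 'unknown shuffle mode: %s\n' "$1" >&2
      return 1
      ;;
  esac
}

# Example use (application class and jar are placeholders); the command
# substitution is left unquoted on purpose so the flags split into words:
#   spark-submit $(shuffle_conf vanilla) --class com.example.App app.jar
```

Keeping both flag sets in one place makes it harder to set SortShuffleManager without also enabling spark.oap.sql.columnar.enableFallbackShuffle, which the description pairs with it.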

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

lviiii, Jul 25 '22 10:07

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/oap-project/native-sql-engine/issues

Then could you also rename the commit message and pull request title to the following format?

[NSE-${ISSUES_ID}] ${detailed message}

See also:

github-actions[bot], Jul 25 '22 10:07

Assuming the first two commits are not relevant to your patch, please do NOT include them. If your work depends on these commits, it would be better to open a dedicated PR to port them to the main branch.

PHILO-HE, Jul 26 '22 07:07