Add spark350emr shim layer [EMR]
This PR adds a new shim layer, spark350emr, which supports running Spark RAPIDS on AWS EMR Spark 3.5.0.
Note: this PR is a new revision of the previous PR, rebased onto branch-24.04. More details about testing are in that PR.
Somewhere in ./jenkins/version-def.sh or in .github/workflows/mvn-verify-check.yml we need to exclude 350emr from the package-tests matrix on GitHub-hosted PR checks, because we presumably won't have access to the 3.5.0-amzn-0 dependencies there.
Error: Failed to execute goal on project rapids-4-spark-emr-bom: Could not resolve dependencies for project com.nvidia:rapids-4-spark-emr-bom:pom:24.04.0-SNAPSHOT: The following artifacts could not be resolved: org.apache.spark:spark-sql_2.12:jar:3.5.0-amzn-0, org.apache.spark:spark-hive_2.12:jar:3.5.0-amzn-0: org.apache.spark:spark-sql_2.12:jar:3.5.0-amzn-0 was not found in https://repo1.maven.org/maven2 during a previous attempt. This failure was cached in the local repository and resolution is not reattempted until the update interval of central has elapsed or updates are forced -> [Help 1]
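To illustrate the kind of exclusion meant here, below is a minimal sketch in Python. It is not the contents of version-def.sh or the workflow file; the function name, the "emr" suffix rule, and the example shim list are all assumptions made for illustration only.

```python
# Hypothetical sketch: names and the shim list are illustrative assumptions,
# not taken from jenkins/version-def.sh or mvn-verify-check.yml.
def shims_for_hosted_checks(all_shims, excluded_suffixes=("emr",)):
    """Drop shims whose vendor artifacts cannot be resolved from public repos."""
    # str.endswith accepts a tuple, so more vendor suffixes can be added later.
    return [s for s in all_shims if not s.endswith(excluded_suffixes)]


if __name__ == "__main__":
    example_shims = ["330", "341", "350", "350emr"]  # illustrative list only
    print(shims_for_hosted_checks(example_shims))
```

The real change would likely live in the shell script or the workflow matrix definition; the point is only that 350emr is filtered out before the matrix runs on hosted runners.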
We will also need to produce a spark-rapids-private shim for 350emr.
cc @GaryShen2008 @sameerz
EMR pre-merge and nightly build and test will follow what we've done for the Databricks runtime.
We'll need separate CI jobs running on EMR.
Retargeting to branch-24.06 for the next release, since the v24.04 release is currently in progress. Please let me know if you have any concerns, thanks!