spark-rapids
Use alluxio native API to mount instead of cmd
Currently, we shell out to the Alluxio command line to get the mount table and to mount an S3 bucket. This PR introduces the Alluxio client jar so that both operations go through the native API instead:
- Get the mount table
- Mount an S3 bucket
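The check-then-mount flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `MountClient` and `InMemoryMountClient` are hypothetical stand-ins for the Alluxio client so the example runs without a cluster, and mounting `s3://<bucket>/` at `/<bucket>` is an assumed convention.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class MountSketch {
    /** Hypothetical stand-in for the two client operations this PR relies on. */
    interface MountClient {
        /** Returns the mount table as Alluxio path -> under-storage URI. */
        Map<String, String> getMountTable();

        /** Mounts the under-storage URI at the given Alluxio path. */
        void mount(String alluxioPath, String ufsUri, Map<String, String> options);
    }

    /** In-memory fake so the sketch can run without an Alluxio cluster. */
    static final class InMemoryMountClient implements MountClient {
        private final Map<String, String> table = new LinkedHashMap<>();

        @Override
        public Map<String, String> getMountTable() {
            // Return a copy so callers cannot mutate the fake's state.
            return new LinkedHashMap<>(table);
        }

        @Override
        public void mount(String alluxioPath, String ufsUri, Map<String, String> options) {
            if (table.containsKey(alluxioPath)) {
                throw new IllegalStateException("Already mounted: " + alluxioPath);
            }
            table.put(alluxioPath, ufsUri);
        }
    }

    /**
     * Mounts the bucket only if the mount table does not already contain it.
     * Returns true when a new mount was created, false when it already existed.
     * The "/<bucket>" mount point is an assumption made for this sketch.
     */
    static boolean ensureBucketMounted(MountClient client, String bucket,
                                       Map<String, String> options) {
        String alluxioPath = "/" + bucket;
        String ufsUri = "s3://" + bucket + "/";
        if (client.getMountTable().containsKey(alluxioPath)) {
            return false;
        }
        client.mount(alluxioPath, ufsUri, options);
        return true;
    }

    public static void main(String[] args) {
        MountClient client = new InMemoryMountClient();
        Map<String, String> noOptions = Collections.emptyMap();
        System.out.println(ensureBucketMounted(client, "my-bucket", noOptions));
        System.out.println(ensureBucketMounted(client, "my-bucket", noOptions));
        System.out.println(client.getMountTable());
    }
}
```

In a real integration, the interface would presumably be backed by the Alluxio client jar rather than the in-memory fake, and the options map would carry the S3 access and secret keys where the environment requires them.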
Signed-off-by: Chong Gao [email protected]
Tested in the Databricks Docker environment, but not yet in the Kratos environment, which requires no access or secret keys.
Can you add to the description why we are doing this? Is this just cleanup to get rid of the call out to the CLI? Is the client reasonably compatible across multiple versions, or do we know whether they break compatibility a lot?
Tested compatibility:
- spark-rapids compiled against Alluxio 2.8.0, server at version 2.8.0.
- spark-rapids compiled against Alluxio 2.8.0, server at version 2.8.1.
- spark-rapids compiled against Alluxio 2.8.1, server at version 2.8.1.
Building is blocked by: https://github.com/NVIDIA/spark-rapids/issues/6869
build
Rebased the code and re-tested on the Databricks cluster.
Fixed the NoClassDefFoundError that occurred when running unit tests, see: https://github.com/NVIDIA/spark-rapids/pull/6824/commits/b9d7278c98f65bce1a59c228ddbb435b09386620
@tgravescs Please help review, thanks.
build