spark-rapids

Use alluxio native API to mount instead of cmd

Open res-life opened this issue 2 years ago • 2 comments

Currently, we use the Alluxio command line to get the mount table and to mount an S3 bucket. This PR introduces the Alluxio client jar so that both operations go through the native API instead:

  • Get the mount table
  • Mount an S3 bucket
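
A rough sketch of the flow described above (look up the mount table, then mount the bucket only if it is missing). This is not the code in this PR: `MountClient` is a hypothetical stand-in for the Alluxio client so the example runs without the Alluxio jar, and the `/<bucket>` mount point, the `s3://` scheme, and the `s3a.accessKeyId` / `s3a.secretKey` option names are assumptions for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for the Alluxio client; NOT the real Alluxio API. */
interface MountClient {
    /** Returns mount point -> UFS URI, e.g. "/my-bucket" -> "s3://my-bucket/". */
    Map<String, String> getMountTable();

    /** Mounts the given UFS URI at the given Alluxio path. */
    void mount(String alluxioPath, String ufsUri, Map<String, String> options);
}

/** In-memory fake used only to exercise the sketch. */
final class InMemoryMountClient implements MountClient {
    private final Map<String, String> table = new LinkedHashMap<>();

    @Override
    public Map<String, String> getMountTable() {
        return Collections.unmodifiableMap(table);
    }

    @Override
    public void mount(String alluxioPath, String ufsUri, Map<String, String> options) {
        table.put(alluxioPath, ufsUri);
    }
}

final class MountSketch {
    private MountSketch() {}

    /** Adds credentials only when supplied (option names are assumed). */
    static Map<String, String> buildOptions(Optional<String> accessKey,
                                            Optional<String> secretKey) {
        Map<String, String> options = new LinkedHashMap<>();
        accessKey.ifPresent(k -> options.put("s3a.accessKeyId", k));
        secretKey.ifPresent(k -> options.put("s3a.secretKey", k));
        return options;
    }

    /** Mounts the bucket if absent; returns true when a mount was issued. */
    static boolean ensureMounted(MountClient client, String bucket,
                                 Map<String, String> options) {
        String mountPoint = "/" + bucket;            // assumed layout
        if (client.getMountTable().containsKey(mountPoint)) {
            return false;                            // already mounted
        }
        client.mount(mountPoint, "s3://" + bucket + "/", options);
        return true;
    }

    public static void main(String[] args) {
        MountClient client = new InMemoryMountClient();
        Map<String, String> options =
            buildOptions(Optional.empty(), Optional.empty());
        System.out.println(ensureMounted(client, "my-bucket", options));
        System.out.println(ensureMounted(client, "my-bucket", options));
    }
}
```

Checking the mount table first makes the call safe to repeat, and building the options from `Optional` values lets the same path serve environments that supply no access or secret keys.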

Signed-off-by: Chong Gao [email protected]

res-life avatar Oct 17 '22 09:10 res-life

Tested in the Databricks Docker environment, but not yet tested in the Kratos environment, which does not require access and secret keys.

res-life avatar Oct 18 '22 11:10 res-life

Can you add to the description why we are doing this? Is this just cleanup to get rid of the call out to the CLI? Is the client compatible across multiple versions, or do we know if they break compatibility a lot?

tgravescs avatar Oct 18 '22 13:10 tgravescs

Tested compatibility:

  • spark-rapids compiled against Alluxio 2.8.0, server running 2.8.0
  • spark-rapids compiled against Alluxio 2.8.0, server running 2.8.1
  • spark-rapids compiled against Alluxio 2.8.1, server running 2.8.1

res-life avatar Oct 20 '22 11:10 res-life

Building is blocked by: https://github.com/NVIDIA/spark-rapids/issues/6869

res-life avatar Oct 20 '22 11:10 res-life

build

tgravescs avatar Oct 20 '22 13:10 tgravescs

build

res-life avatar Oct 21 '22 05:10 res-life

build

res-life avatar Oct 21 '22 07:10 res-life

build

res-life avatar Oct 21 '22 07:10 res-life

build

res-life avatar Oct 21 '22 09:10 res-life

Rebased the code and re-tested on the Databricks cluster. Fixed the NoClassDefFoundError that occurred when running the unit tests, see: https://github.com/NVIDIA/spark-rapids/pull/6824/commits/b9d7278c98f65bce1a59c228ddbb435b09386620

res-life avatar Oct 21 '22 12:10 res-life

@tgravescs Please help review, thanks.

res-life avatar Oct 25 '22 03:10 res-life

build

res-life avatar Oct 27 '22 06:10 res-life