incubator-gluten
HDFS Support
The current version of Velox seems to support only S3, not HDFS. Does Gluten support HDFS paths, or can a FUSE-mounted HDFS be used as an alternative?
Velox supports Hive.
Velox added HDFS read support in this commit: https://github.com/facebookincubator/velox/commit/1fc1fec67cd3e1373da1cf03a18a21496382d8c3. @zhouyuan has a PR in progress to integrate it with Gluten. Maybe Yuan can help clarify a bit.
Related to https://github.com/oap-project/gluten/issues/388
@rui-mo Does the master branch support reading Parquet files from HDFS?
@mimitoling There are two pending PRs for HDFS support, PR#388 and PR#152. Some verification and testing work is still in progress.
Velox already has libhdfs support, but the problem is that we either need to bundle all the dependency libraries in the jar or install them manually on each worker node. Currently we have to create a script for this.
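
For the manual-install route, a quick check on each worker node can show whether the native libraries are visible to the dynamic loader. The following is only a rough sketch, not part of Gluten or Velox: the function name `find_missing_libraries` is hypothetical, and the library names `hdfs` and `hdfs3` are assumptions that should be replaced with whatever your build actually links against.

```python
import ctypes.util
from typing import Callable, Iterable, List, Optional


def find_missing_libraries(
    names: Iterable[str],
    finder: Callable[[str], Optional[str]] = ctypes.util.find_library,
) -> List[str]:
    """Return the library names that the finder cannot locate.

    The default finder is ctypes.util.find_library, which returns None
    when a shared library is not found on the system search path.
    """
    return [name for name in names if finder(name) is None]


if __name__ == "__main__":
    # Assumed library names; adjust to match the actual build dependencies.
    required = ["hdfs", "hdfs3"]
    missing = find_missing_libraries(required)
    if missing:
        print("Missing on this node: " + ", ".join(missing))
    else:
        print("All listed libraries were found.")
```

Running this on every worker (for example over ssh) gives a list of nodes that still need the libraries installed. Note that `find_library` only reports whether a library can be located, not whether its own transitive dependencies load correctly.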