incubator-gluten

HDFS Support

Open frankliee opened this issue 2 years ago • 2 comments

The current version of Velox seems to support only S3, not HDFS. Does Gluten support HDFS paths, or can it use a FUSE-mounted HDFS as an alternative?

frankliee avatar May 16 '22 13:05 frankliee

Velox supports hive.

zhanglistar avatar Jul 06 '22 03:07 zhanglistar

Velox added HDFS read support in this commit: https://github.com/facebookincubator/velox/commit/1fc1fec67cd3e1373da1cf03a18a21496382d8c3. @zhouyuan has a work-in-progress PR to integrate it with Gluten. Maybe Yuan can help clarify.

rui-mo avatar Jul 20 '22 11:07 rui-mo

Relevant to https://github.com/oap-project/gluten/issues/388

jinchengchenghh avatar Sep 23 '22 08:09 jinchengchenghh

@rui-mo Does the master branch support reading Parquet files from HDFS?

mimitoling avatar Sep 29 '22 12:09 mimitoling

@mimitoling There are two pending PRs for HDFS support, PR#388 and PR#152. Some verification and testing work is still in progress.

rui-mo avatar Sep 29 '22 12:09 rui-mo

Velox already has libhdfs support. The problem is that we either need to bundle all the dependency libraries in the jar or install them manually on each worker node. For now, we have to write a script to do that.

FelixYBW avatar Oct 06 '22 19:10 FelixYBW
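
As a rough illustration of the per-node dependency problem described above, here is a minimal Python sketch that checks whether a set of native libraries can be found on a worker node. This is not Gluten's or Velox's tooling. The function names (`find_native_libs`, `missing_libs`), the library name `hdfs`, and the extra search directory are all assumptions made for the example; the only real API used is `ctypes.util.find_library` from the Python standard library.

```python
import ctypes.util
import os


def find_native_libs(names, extra_dirs=()):
    """Map each library name to a resolved location, or None if not found.

    Hypothetical helper: `names` are bare library names ("hdfs" for libhdfs).
    """
    found = {}
    for name in names:
        # Ask the platform's usual library lookup first.
        path = ctypes.util.find_library(name)
        if path is None:
            # Fall back to explicitly listed directories (Linux-style naming).
            for directory in extra_dirs:
                candidate = os.path.join(directory, f"lib{name}.so")
                if os.path.isfile(candidate):
                    path = candidate
                    break
        found[name] = path
    return found


def missing_libs(names, extra_dirs=()):
    """Return the sorted names of libraries that could not be located."""
    located = find_native_libs(names, extra_dirs)
    return sorted(name for name, path in located.items() if path is None)


if __name__ == "__main__":
    # Assumption: which libraries a given build needs is not stated in the thread.
    print(missing_libs(["hdfs"], extra_dirs=["/usr/local/lib"]))
```

A script like this could run on each worker before a job starts, so that a missing library is reported up front rather than surfacing as a load failure at query time.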