HDFS support in iceberg-rust
There are several ways to access HDFS in rust:
- libhdfs + JNI: needs java runtime and libhdfs
- WebHDFS: http based, may have performance issues
- HDFS Native: rust native but not well adopted.
OpenDAL supports all these methods. However, from Iceberg's perspective, it's better to expose a single hdfs storage option and allow users to choose the method via feature flags to avoid confusion. This is because we write service names in our metadata files, and all implementations must be able to read hdfs://host:port/files.
So my current plan is to introduce webhdfs and hdfs_native support in iceberg-rust. They will hide under storage-webhdfs and storage-hdfs-native flag:
Users can choose which implementations to use:
fn parse_scheme(scheme: &str) -> crate::Result<Scheme> {
match scheme {
"memory" => Ok(Scheme::Memory),
"file" | "" => Ok(Scheme::Fs),
"s3" | "s3a" => Ok(Scheme::S3),
"gs" | "gcs" => Ok(Scheme::Gcs),
#[cfg(feature = "storage-webhdfs")]
"hdfs" => Ok(Scheme::Webhdfs),
#[cfg(feature = "storage-hdfs-native")]
"hdfs" => Ok(Scheme::HdfsNative),
s => Ok(s.parse::<Scheme>()?),
}
}
See https://github.com/apache/iceberg-rust/pull/1131 for more details about this design.
cc @liurenjie1024, let's discuss here.
I'm also fine to add a new services called webhdfs here. But the location returned by hive will still be hdfs://host:port/abc. It will be a bit confusing.
cc @liurenjie1024, let's discuss here.
I'm also fine to add a new services called
webhdfshere. But the location returned by hive will still behdfs://host:port/abc. It will be a bit confusing.
Do you mean that even if user is using webhdfs, hive metastore will still return hdfs path?
Do you mean that even if user is using webhdfs, hive metastore will still return hdfs path?
Yep. There is no iceberg impls will write webhdfs://xxx.
Do you mean that even if user is using webhdfs, hive metastore will still return hdfs path?
Yep. There is no iceberg impls will write
webhdfs://xxx.
Seems there are several problems needs to resolve here:
- Should we add webhdfs support as feature? It's mainly compile time support or runtime support for webhdfs
- Should we treat webhdfs/hdfs as same storage or different service?
Is there other problems to discuss?
- It's mainly compile time support or runtime support for webhdfs
What's the difference about this?
- It's mainly compile time support or runtime support for webhdfs
What's the difference about this?
Oh, I was thinking about treating hdfs/webhdfs as same storage service. I took a look your pr, if they are treated as different stoarge, then there is difference.
Oh, I was thinking about treating hdfs/webhdfs as same storage service. I took a look your pr, if they are treated as different stoarge, then there is difference.
I see, you mean that we have a single Storage Hdfs but provides a config like use_webhdfs for users to switch?
Oh, I was thinking about treating hdfs/webhdfs as same storage service. I took a look your pr, if they are treated as different stoarge, then there is difference.
I see, you mean that we have a single Storage
Hdfsbut provides a config likeuse_webhdfsfor users to switch?
Yes, but I'm not sure it's the best approach.
I see. We have three approachs so far:
- Support
storage-webhdfsandstorage-hdfsbut only exposes ashdfsto users
Users are required to use feature flags for configuration.
- Have only
storage-hdfsbut allow users to pick up impls likewebhdfsorhdfsat runtime by config.
Provides more control for users, but it may be a bit confusing when feature flags don't match the configuration.
- Exposes
storage-webhdfsandstorage-hdfsseperately
It doesn't work for webhdfs since all tables will use hdfs://abc.
Have you considered including libhdfs + JNI in this design? Are we intentionally leaving it out for now, or could it be explored as a separate approach?
I was wondering if the viewfs:// scheme might be supported in this context? I’ve tested it with OpenDAL and confirmed that it’s already supported there.
Have you considered including libhdfs + JNI in this design? Are we intentionally leaving it out for now, or could it be explored as a separate approach?
Yep, I intentionally leaving it out for now since I don't want to play with java runtime for now. And I'm guessing adding it doesn't need another approach. We can just follow the same way to handle it.
I was wondering if the viewfs:// scheme might be supported in this context? I’ve tested it with OpenDAL and confirmed that it’s already supported there.
Oh, that's something new to me. We just need tp map viewfs to hdfs, right?
Have you considered including libhdfs + JNI in this design? Are we intentionally leaving it out for now, or could it be explored as a separate approach?
Yep, I intentionally leaving it out for now since I don't want to play with java runtime for now. And I'm guessing adding it doesn't need another approach. We can just follow the same way to handle it.
I actually tested accessing HDFS using libhdfs + JNI myself, and it worked! The main issue I ran into was linking the JVM library. It’d be awesome if support for this could be added here eventually.
I was wondering if the viewfs:// scheme might be supported in this context? I’ve tested it with OpenDAL and confirmed that it’s already supported there.
Oh, that's something new to me. We just need tp map
viewfstohdfs, right?
To clarify, I ran my test using libhdfs + JNI, and I’m not entirely sure if viewfs is supported in the other two approaches.
In my experience with libhdfs + JNI, viewfs is handled the same way as hdfs, with libhdfs taking care of the differences under the hood. That said, it seems like we’d only need to explicitly pass the scheme to OpenDAL for it to work smoothly.
Sorry that I had to leave early today at the sync, it was a great discussion. Both in Java and PyIceberg there is a FileIO property that you can set to force a specific FileIO: https://py.iceberg.apache.org/configuration/#fileio. This is slightly different since all the implementations come from OpenDAL, but might be an option to consider.
Per yesterday's discussion, we reached consensus on following points:
- We should treat webhdfs as a storage different hdfs given their configuration would be different.
- We should include both libhdfs and hdfsrs in iceberg-rust and allow user to choose at runtime. Libhdfs is still much more mature than hdfsrs so in serious hadoop deployment it's still useful. Hdfsrs allows user to run without relying on, so we think it's still valueful in some use cases.
AFAIK, webhdfs doesn't support HDFS NameNode HA. When fail over happens, it won't switch to the new active NameNode automatically.
Does iceberg rust support hadoop catalog?
I only see s3tables in the code base.
Does iceberg rust support hadoop catalog? I only see
s3tablesin the code base.
Hi, @mingmwang There is no support for hadoop catalog, but there is StaticTable which allows you to query tables without relying on a catalog.
🔄 HDFS支持实施状态更新 (2025年9月12日)
✅ 技术分析完成
经过全面分析,确认以下技术要点:
-
OpenDAL 0.54.0 完全支持:确认支持三种HDFS服务
Hdfs(libhdfs + JNI)WebHdfs(HTTP REST API)HdfsNative(纯Rust实现)
-
存储架构已就绪:当前Storage枚举架构完全支持扩展HDFS变体
- 现有模式:Memory、LocalFs、S3、GCS、OSS、Azdls
- 可直接添加三个HDFS变体,遵循相同的配置和操作符创建模式
-
社区共识明确:基于2025年3月讨论达成的一致方向
- 三个独立Storage变体
- 运行时选择机制(通过feature flags)
- 支持viewfs://协议映射
📋 实施计划
优先级顺序:WebHdfs > Hdfs > HdfsNative
技术实施步骤:
- 添加feature flags:
storage-hdfs、storage-webhdfs、storage-hdfs-native - 在Storage枚举中添加三个HDFS变体
- 实现配置解析和操作符创建函数
- 在parse_scheme中添加协议映射(hdfs://、webhdfs://、viewfs://)
- 设计运行时选择机制
⚠️ 关键限制说明
- WebHDFS NameNode HA限制:如@manuzhang指出,WebHDFS不支持自动故障切换,需在文档中突出说明
- libhdfs运行时要求:需要Java环境支持
🔍 PR #1450 状态确认
经检查,PR #1450 当前状态:
- 状态:CONFLICTING(合并冲突)
- 最后更新:2025年6月18日
- 技术方向:专注于Hadoop catalog实现,而非Storage层面
该PR方向与当前Storage层面需求不同,需要与@awol2005ex进一步沟通确认后续处理方案。
🎯 下一步行动
准备开始具体实施,将基于现有Storage架构模式直接实现HDFS支持,无需等待存在冲突的PR。
估计实施时间:2-3周完成核心功能
@Xuanwo Are you doing some experiment with AI summary?
OpenDAL的HDFS写路径问题挺多的
而且OpenDAL套路有点深,问题很难triage
@Xuanwo Are you doing some experiment with AI summary?
Oh, sorry about that. My AI summary bot accidentally posted the comments publicly.