Integrate hdfs services with `hdfs-native`
https://github.com/Kimahriman/hdfs-native is a pure Rust implementation of an HDFS client. It might be worth considering integrating it as a new service in opendal, allowing our users to give it a try.
I am looking for basic HDFS operations (read, write, and delete) in my app.
The current hdfs-sys based approach is costly in terms of image size because of the Hadoop jars that need to be pulled in. If this is a pure Rust implementation, it could be worth investing in.
I can take this up, as we already use OpenDAL in our application, and support for hdfs-native would remove the need for an additional layer in my app.
Please let me know if this is up for grabs and I can spend time on it.
Thank you!
Hi, @shbhmrzd, I'm waiting for https://github.com/Kimahriman/hdfs-native/issues/13.
Sure @Xuanwo I would like to pick up OpenDAL integration when it is planned :)
Thanks! I will ping you here when it's ready.
hdfs-native 0.5 has been released, let's do this!
Awesome! Can I take it up?
Welcome! Have fun. We can add the layout first and fill in the implementation piece by piece.
Thanks to @shbhmrzd and @jihuayu's efforts. We have the basic feature set now. Maybe it's time for us to establish the behavior test before adding more features.
@Xuanwo Hi, I want to add the behavior tests for it. Is there some information about how to add the behavior test?
Our behavior test exists at https://github.com/apache/opendal/tree/main/core/tests/behavior
We can follow the same setup as hdfs for our hdfs_native tests:
- fixtures: we use those fixtures to setup our services, https://github.com/apache/opendal/tree/main/fixtures/hdfs
- setup: we have an automated workflow for all services. We just need to add a new setup for hdfs_native, like https://github.com/apache/opendal/tree/main/.github/services/hdfs/hdfs_default
The tests should then run automatically in our CI.
Hi, @jihuayu, do you need some help from me?
@Xuanwo Thank you! I did encounter some trouble: When I was preparing to write test cases, I found that the reader and writer for hdfs_native were not implemented, so I decided to implement them first. Since I'm not very familiar with asynchronous Rust, my code has not been functioning properly. Could you help me take a look at how I should write it? https://github.com/jihuayu/opendal/blob/f/hdfs-test/core/src/services/hdfs_native/writer.rs#L43 After running it, I found that the write never stops.
FileWriter::write is an async function, so every call to f.write() creates a new future. You will need to store this future and poll it until it returns Ready.
For example:
https://github.com/apache/opendal/blob/ab52b437fdb19b7fbaa4be850e00c84f360d06f1/core/src/raw/oio/read/range_read.rs#L51-L56
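To make the idea concrete, here is a minimal, std-only sketch of the "store the future and poll it until Ready" pattern. It is not the OpenDAL or hdfs-native code: `write_chunk`, `YieldOnce`, `PollWriter`, and `NoopWake` are hypothetical names invented for illustration, and `write_chunk` only stands in for an async write such as `FileWriter::write`. Creating a fresh future on every poll would restart the write each time, which is one way a write can appear to never finish.

```rust
use std::future::Future;
use std::io;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical helper: a future that returns Pending once, then Ready.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Hypothetical stand-in for an async write; not a real hdfs-native API.
async fn write_chunk(data: Vec<u8>) -> io::Result<usize> {
    YieldOnce(false).await; // simulate I/O that is not ready on the first poll
    Ok(data.len())
}

type WriteFuture = Pin<Box<dyn Future<Output = io::Result<usize>> + Send>>;

/// Hypothetical poll-based writer that keeps the in-flight write across polls.
struct PollWriter {
    fut: Option<WriteFuture>,
}

impl PollWriter {
    fn new() -> Self {
        Self { fut: None }
    }

    fn poll_write(&mut self, cx: &mut Context<'_>, data: &[u8]) -> Poll<io::Result<usize>> {
        // Create the future only when no write is in flight.
        if self.fut.is_none() {
            let fut: WriteFuture = Box::pin(write_chunk(data.to_vec()));
            self.fut = Some(fut);
        }
        let fut = self.fut.as_mut().expect("future was just stored");
        match fut.as_mut().poll(cx) {
            Poll::Ready(res) => {
                // Drop the finished future so the next call starts a new write.
                self.fut = None;
                Poll::Ready(res)
            }
            Poll::Pending => Poll::Pending,
        }
    }
}

/// Waker that does nothing; enough to drive the sketch without a runtime.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn main() {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut writer = PollWriter::new();
    let mut polls = 0;
    let written = loop {
        polls += 1;
        if let Poll::Ready(res) = writer.poll_write(&mut cx, b"hello") {
            break res.expect("write failed");
        }
    };
    println!("wrote {written} bytes after {polls} polls");
}
```

The first `poll_write` call stores the future and returns Pending; the second call polls the same stored future, which then completes. A real implementation would rely on the runtime's waker instead of the busy loop in `main`.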
@Xuanwo Ohhh! I know! Thank you. I will have a try!
Hi, @jihuayu, I've refactored OpenDAL's whole IO trait. Would you like to give it another try?
@Xuanwo Thank you. I love the new trait. I've been quite busy lately, and I'll be back in a few months.
@Xuanwo Can I take a swing at implementing the read and write? Thank you!
Of course, have fun!