OpenKAL: Open Key Access Layer

Open Xuanwo opened this issue 1 year ago • 8 comments

We previously discussed this at https://github.com/apache/opendal/discussions/1796. I'm still unsure if it's a good idea, but I think we should at least initiate a discussion on it.

I've seen use cases where opendal is used for key-based access through simple read/write/delete operations, which could easily be mapped to get/set/del. By offering openkal, their use cases could be greatly simplified.
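As a minimal sketch of that mapping (not an actual openkal or opendal API): the `BlobStore` trait below is a hypothetical stand-in for an operator's read/write/delete, `MemoryStore` is an in-memory backend so the example runs on its own, and `KeyValue` is an assumed facade name. The only behavioral difference it introduces is that a missing key comes back as `None` instead of an error.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical stand-in for blob-style operations on a path.
/// This is NOT a real opendal type, only an illustration.
trait BlobStore {
    fn read(&self, path: &str) -> io::Result<Vec<u8>>;
    fn write(&mut self, path: &str, data: Vec<u8>) -> io::Result<()>;
    fn delete(&mut self, path: &str) -> io::Result<()>;
}

/// In-memory backend so the sketch needs no external service.
#[derive(Default)]
struct MemoryStore {
    blobs: HashMap<String, Vec<u8>>,
}

impl BlobStore for MemoryStore {
    fn read(&self, path: &str) -> io::Result<Vec<u8>> {
        self.blobs
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, path.to_string()))
    }

    fn write(&mut self, path: &str, data: Vec<u8>) -> io::Result<()> {
        self.blobs.insert(path.to_string(), data);
        Ok(())
    }

    fn delete(&mut self, path: &str) -> io::Result<()> {
        // Deleting a missing path is treated as a no-op.
        self.blobs.remove(path);
        Ok(())
    }
}

/// Hypothetical key-value facade: get/set/del are thin renames of
/// read/write/delete, except that a missing key becomes `None`.
struct KeyValue<S: BlobStore> {
    store: S,
}

impl<S: BlobStore> KeyValue<S> {
    fn new(store: S) -> Self {
        Self { store }
    }

    fn get(&self, key: &str) -> io::Result<Option<Vec<u8>>> {
        match self.store.read(key) {
            Ok(value) => Ok(Some(value)),
            Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(None),
            Err(err) => Err(err),
        }
    }

    fn set(&mut self, key: &str, value: &[u8]) -> io::Result<()> {
        self.store.write(key, value.to_vec())
    }

    fn del(&mut self, key: &str) -> io::Result<()> {
        self.store.delete(key)
    }
}
```

With this shape, `KeyValue::new(MemoryStore::default())` gives a store where `set("a", b"1")` followed by `get("a")` yields `Some` of the same bytes, and `get` after `del` yields `None`.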

What do you think about this?


Challenges I can foresee:

  • Transaction
  • Scan
  • Pipeline/Batch

Xuanwo avatar Feb 18 '24 04:02 Xuanwo

I have done similar work in https://github.com/datenlord/datenlord/blob/master/src/async_fuse/memfs/kv_engine/mod.rs and found that get/set/del/range fully meets our requirements. As for transactions, we abstracted an intermediate representation based on the read-modify-write model, which consists of get/set/del/commit. This makes the transaction logic much simpler. I'm really looking forward to this feature.
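To illustrate the get/set/del/commit shape (this is a simplified sketch, not the datenlord implementation): the `Store` and `Txn` names are hypothetical, the backend is an in-memory `BTreeMap`, and there is no conflict detection because the `&mut` borrow allows only one open transaction at a time.

```rust
use std::collections::BTreeMap;

/// Hypothetical ordered in-memory store; BTreeMap keeps keys sorted,
/// which is what makes `range` cheap to offer.
#[derive(Default)]
struct Store {
    data: BTreeMap<String, Vec<u8>>,
}

/// A buffered transaction: writes are staged and applied only on `commit`.
/// A staged `None` marks a pending delete.
struct Txn<'a> {
    store: &'a mut Store,
    writes: BTreeMap<String, Option<Vec<u8>>>,
}

impl Store {
    fn begin(&mut self) -> Txn<'_> {
        Txn { store: self, writes: BTreeMap::new() }
    }

    /// Entries with keys in the half-open interval [start, end), sorted.
    fn range(&self, start: &str, end: &str) -> Vec<(String, Vec<u8>)> {
        if start >= end {
            // BTreeMap::range panics on an inverted range, so bail out early.
            return Vec::new();
        }
        self.data
            .range(start.to_string()..end.to_string())
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
}

impl Txn<'_> {
    /// Read-your-own-writes: staged changes shadow the committed state.
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        match self.writes.get(key) {
            Some(staged) => staged.clone(),
            None => self.store.data.get(key).cloned(),
        }
    }

    fn set(&mut self, key: &str, value: &[u8]) {
        self.writes.insert(key.to_string(), Some(value.to_vec()));
    }

    fn del(&mut self, key: &str) {
        self.writes.insert(key.to_string(), None);
    }

    /// Apply all staged writes. Dropping the Txn instead discards them.
    fn commit(self) {
        let Txn { store, writes } = self;
        for (key, staged) in writes {
            match staged {
                Some(value) => {
                    store.data.insert(key, value);
                }
                None => {
                    store.data.remove(&key);
                }
            }
        }
    }
}
```

A real backend would need to decide what `commit` does when another writer has changed a key that the transaction read; that part is deliberately left out here.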

xiaguan avatar Feb 18 '24 05:02 xiaguan

Actually, I don't like this idea; it would blur the boundaries of the project.

For now, we can assume that all the data we need to store in the KV is binary-like (blobs, pure binary, or something similar). That fully matches the goal of the object model.

If we want openkal to cover KV storage, we may need to handle many types across different KV systems (set, datetime, text, etc.). It's OK to start a new project for openkal, but mixing it into opendal would be a disaster.

Zheaoli avatar Feb 18 '24 07:02 Zheaoli

> If we want openkal to cover KV storage, we may need to handle many types across different KV systems (set, datetime, text, etc.). It's OK to start a new project for openkal, but mixing it into opendal would be a disaster.

To clarify, this idea continues to work with binary-like data, but through a more key-value-like API. Users who want set/datetime/geo types should use the service's SDK directly.

Xuanwo avatar Feb 18 '24 08:02 Xuanwo

> To clarify, this idea continues to work with binary-like data, but through a more key-value-like API. Users who want set/datetime/geo types should use the service's SDK directly.

Got it, but I still think it would be terrible to split the API style.

Zheaoli avatar Feb 18 '24 16:02 Zheaoli

> Got it, but I still think it would be terrible to split the API style.

The existence of openkal should have no effect on opendal users. In fact, users can wrap get/set/del on opendal directly.

Can you elaborate on the background behind your concerns?

Xuanwo avatar Feb 18 '24 17:02 Xuanwo

> The existence of openkal should have no effect on opendal users. In fact, users can wrap get/set/del on opendal directly.

Do you mean OpenKAL is just a kv wrapper around OpenDAL?

By the way, are we going to store KV data on backends like S3, HDFS, etc.? This feels a bit strange.

jihuayu avatar Feb 19 '24 00:02 jihuayu

> Do you mean OpenKAL is just a kv wrapper around OpenDAL?

The point is that openkal is not included in the opendal rust core. How it's implemented is a separate issue.

> By the way, are we going to store KV data on backends like S3, HDFS, etc.? This feels a bit strange.

No, I'm not suggesting or planning for users to store key-value data on backends like S3. I'm simply discussing the possibility of empowering users to access key-value data on ANY storage service.

Some users already use opendal to access data that looks like key-value pairs:

let op: opendal::Operator;
let bs = op.read(path).await?;
let my_type: MyType = serde_json::from_slice(&bs)?;

Maybe it could be fun to have:

let op: openkal::Operator;

let bs: Vec<u8> = op.get(path).await?;
let my_type: MyType = op.get_json(path).await?;

Xuanwo avatar Feb 19 '24 02:02 Xuanwo

Short note: maybe the Entry API is worth a look for this, as it is idiomatic Rust and people can easily get into it.
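For readers unfamiliar with it, here is a small sketch assuming the comment refers to the standard library's map Entry API (`HashMap::entry`). The helper names `append` and `get_or_init` are made up for illustration; how such an API would translate to a remote key-value backend is an open question.

```rust
use std::collections::HashMap;

/// Append `chunk` to the value under `key`, creating an empty value first
/// if the key is missing. `entry` does the lookup once for both cases.
fn append(map: &mut HashMap<String, Vec<u8>>, key: &str, chunk: &[u8]) {
    map.entry(key.to_string())
        .or_default()
        .extend_from_slice(chunk);
}

/// Return the existing value, or insert `default` and return that.
/// The closure runs only when the key is absent.
fn get_or_init<'a>(
    map: &'a mut HashMap<String, Vec<u8>>,
    key: &str,
    default: &[u8],
) -> &'a mut Vec<u8> {
    map.entry(key.to_string())
        .or_insert_with(|| default.to_vec())
}
```

The appeal for a key-value layer is that "get, then set if missing" becomes a single expression instead of two separate calls.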

simonsan avatar Feb 22 '24 15:02 simonsan