Add test cases for the data encoding compatibility

Open git-hulk opened this issue 1 year ago • 7 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

Motivation

As @PragmaTwice mentioned in an offline discussion (@DenizPiri also raised this in https://github.com/apache/incubator-kvrocks/issues/414#issuecomment-1432883367), we don't have any test cases to ensure that the data encoding stays compatible with older versions, which is very important for a storage service.

Solution

We can create a minimal dataset covering all data types for each release (starting from 2.3.0), then load each dataset with the current version and check that it loads correctly.
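
As a rough sketch of what such a per-release check could look like (not a proposal for the actual layout): each release gets a small fixture with one key per data type and its expected value, and a helper reports which keys the current build reads back differently. The `FIXTURES` table, the key names, and the `check_dataset` and `read_value` names are all hypothetical, and a plain dict stands in for a running server.

```python
from typing import Any, Callable, Dict, List

# Hypothetical fixture: one small entry per data type, recorded per release.
FIXTURES: Dict[str, Dict[str, Any]] = {
    "2.3.0": {
        "compat:string": "hello",
        "compat:hash": {"f1": "v1"},
        "compat:list": ["a", "b"],
    },
}


def check_dataset(version: str, read_value: Callable[[str], Any]) -> List[str]:
    """Return the keys whose value, as read by the current build, differs from the fixture."""
    mismatches = []
    for key, expected in FIXTURES[version].items():
        if read_value(key) != expected:
            mismatches.append(key)
    return mismatches
```

In a real test, `read_value` would issue the matching read command against the current build after it has opened the old dataset; an empty result means the dataset loaded as expected.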

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

git-hulk avatar Apr 05 '23 16:04 git-hulk

@git-hulk I see this issue in the roadmap, and as it seems that no one is working on it (because this issue is troublesome and dirty...?), could you assign it to me?

Currently I only have some unclear thoughts on this topic. I think to check the data encoding compatibility, we are supposed to:

  1. Write the raw data in each older version's encoding (using the RocksDB API directly and bypassing the kvrocks API);
  2. Then read the data with the current version's kvrocks API to check that the data is readable and that its semantics are correct.
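
To make the two steps concrete, here is a minimal sketch using a toy value layout. It is not the real kvrocks encoding: the version byte, the field widths, and the `encode_v1`, `encode_v2`, and `decode_current` names are assumptions made up for illustration, and raw bytes stand in for what would be written through RocksDB.

```python
import struct
from typing import Tuple


def encode_v1(payload: bytes, expire_s: int) -> bytes:
    # Toy "old" layout: 1-byte version, 4-byte big-endian expire in seconds, payload.
    return struct.pack(">BI", 1, expire_s) + payload


def encode_v2(payload: bytes, expire_ms: int) -> bytes:
    # Toy "current" layout: 1-byte version, 8-byte big-endian expire in milliseconds, payload.
    return struct.pack(">BQ", 2, expire_ms) + payload


def decode_current(raw: bytes) -> Tuple[bytes, int]:
    """Decode either layout and return (payload, expire in milliseconds)."""
    version = raw[0]
    if version == 1:
        (expire_s,) = struct.unpack_from(">I", raw, 1)
        return raw[5:], expire_s * 1000
    if version == 2:
        (expire_ms,) = struct.unpack_from(">Q", raw, 1)
        return raw[9:], expire_ms
    raise ValueError(f"unknown encoding version: {version}")
```

Step 1 corresponds to producing bytes with `encode_v1`, and step 2 to asserting that `decode_current` returns the same payload and an equivalent expire time, which is the "semantics are correct" part.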

But still, I am not sure of the proper workflow, or how far we should go. Maybe you could kindly point me to some existing projects' "data encoding checking" mechanism or code to shed some light on this, if you happen to know of any. I will try to find some material myself as well.

HolyLow avatar Sep 25 '23 01:09 HolyLow

Hi @HolyLow, assigned. Thanks for your interest.

Let me think a bit about this, I will ping you back soon.

git-hulk avatar Sep 25 '23 01:09 git-hulk

@git-hulk Hello, do you have any thoughts yet?

HolyLow avatar Dec 22 '23 09:12 HolyLow

@HolyLow Thanks for following up; I forgot about this issue. Feel free to go ahead if you have any ideas for achieving this.

git-hulk avatar Dec 23 '23 07:12 git-hulk

I think maybe you can add a new job in the nightly CI workflow, and cache the binary of an old version of kvrocks, e.g. 2.6.0.

PragmaTwice avatar Dec 23 '23 08:12 PragmaTwice

I'm wondering if it's good to maintain a docker image for testing this.

git-hulk avatar Dec 23 '23 08:12 git-hulk

Ahh, good idea. We can just fetch Docker images of old kvrocks versions rather than caching the binary in GitHub Actions.
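
A small sketch of how a CI job might start an old release from an image. It assumes the images are published as `apache/kvrocks:<version>`, which should be verified before relying on it, and that the server listens on the default port 6666 inside the container. The `old_version_run_cmd` helper and the container naming are hypothetical, and how the data directory is shared with the current build is left open.

```python
from typing import List


def old_version_run_cmd(version: str, host_port: int = 6666) -> List[str]:
    """Build a `docker run` command that starts an old kvrocks release in the background."""
    image = f"apache/kvrocks:{version}"  # assumed image name and tag scheme
    return [
        "docker", "run", "-d", "--rm",
        "--name", f"kvrocks-{version}",   # hypothetical container name
        "-p", f"{host_port}:6666",        # 6666 is the default kvrocks port
        image,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; the job would then write the dataset through the old server before handing it to the current build.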

PragmaTwice avatar Dec 23 '23 10:12 PragmaTwice