kvrocks icon indicating copy to clipboard operation
kvrocks copied to clipboard

feat(hash): add support of hash expiration commands

Open ltagliamonte-dd opened this issue 11 months ago • 21 comments

based on https://github.com/apache/kvrocks/pull/2402

  • fixed CommandHStrlen return
  • removed keyExpirationCheck in GetMetadata as per @PragmaTwice request
  • verified all cmds

@jjz921024 I verified all hash cmds and they seem to work without the additional check in GetMetadata do you happen to remember if there was a specific use case you added it there?

ltagliamonte-dd avatar Jan 09 '25 00:01 ltagliamonte-dd

@ltagliamonte-dd Hi. Did you set this config hash-field-expiration to yes or run the unit tests in this branch? If the field has an expiration, we have to check to the field expiration before each read operation.

jjz921024 avatar Jan 09 '25 12:01 jjz921024

@ltagliamonte-dd Hi. Did you set this config hash-field-expiration to yes or run the unit tests in this branch? If the field has an expiration, we have to check to the field expiration before each read operation.

Hello @jjz921024 thanks for getting back to me, yes I did test all hash commands having hash-field-expiration yes. I plan to give another looks at the code today, i'd love to get in touch with you to better understand the check because from what i could briefly see the key expiration check is also done in each cmds implementations. I coudn't find your contact neither on the kvrocks slack nor zulipchat. i did run the unit test with no problem the go integration test only a test case is failing (i plan to look at it today):

--- FAIL: TestHashFieldExpiration (7.66s)
    --- FAIL: TestHashFieldExpiration/HFE_check_hash_metadata_after_all_of_fields_expired (1.00s)
        hash_test.go:1018: 
            	Error Trace:	/Users/luigi.tagliamonte/Projects/doordash/kvrocks/tests/gocase/unit/type/hash/hash_test.go:1018
            	Error:      	Not equal: 
            	            	expected: -2ns
            	            	actual  : -1ns
            	Test:       	TestHashFieldExpiration/HFE_check_hash_metadata_after_all_of_fields_expired
FAIL
exit status 1
FAIL	github.com/apache/kvrocks/tests/gocase/unit/type/hash	199.945s

ltagliamonte-dd avatar Jan 09 '25 20:01 ltagliamonte-dd

@jjz921024 the CI seems to pass removing the check in GetMetadata and fixing GetTime. Would love to get it touch with you to see if I'm missing something here.

ltagliamonte avatar Jan 10 '25 03:01 ltagliamonte

@jjz921024 @ltagliamonte-dd Thanks for your great efforts. I took my first pass in a new PR and generally look in good shape from my perspective. Two nit points can be improved:

  1. Change ExistValidField to GetValidFieldCount, so that Hash::Size can also resue this function.
  2. Many places decode the hash metadata and check if it has any valid fields. We can avoid duplicate code by adding a dedicated function to this check.
    if (metadata.Type() == kRedisHash) {
      HashMetadata hash_metadata(false);
      s = hash_metadata.Decode(rocksdb::Slice(pin_values[i].data(), pin_values[i].size()));
      if (!s.ok()) continue;
      redis::Hash hash_db(storage_, namespace_);
      if (hash_db.ExistValidField(ctx, slice_keys[i], hash_metadata)) {

git-hulk avatar Jan 11 '25 04:01 git-hulk

@ltagliamonte-dd Hi, I create a PR #1 in your kvrocks fork repo. There are some polish about this feature for your reference.

The purpose of this this commit https://github.com/apache/kvrocks/pull/2402/commits/aac63cc83d0457bc1c02061063975e6a1736b8cc is to avoid unnecessary field expiration checks when executing the hset/hmset command, which comes from https://github.com/apache/kvrocks/pull/2402#discussion_r1894554219 @PragmaTwice suggestion.

The purpose of this this commit https://github.com/apache/kvrocks/pull/2402/commits/73062319eccf80ec06317b3541af8fca3313fd41 is make ExistValidField method more general so that Hash::Size can also resue it. which comes from @git-hulk suggestion.

jjz921024 avatar Jan 12 '25 12:01 jjz921024

Hello @jjz921024 thank you for the help on getting this through the finish line. As I looked at the PR https://github.com/ltagliamonte-dd/kvrocks/pull/1 I can see that GetMetadata still contains the expiration check on all fields keys, that @PragmaTwice was pointing out as possible perf problem.

Discussing this check with @PragmaTwice and @git-hulk on zulipchat we decided to move the check up in the call stack and perform it only when required.

As the CI on this PR is showing (and my verification) it looks like there is no need for expiration check in GetMetadata, I would love to hear from you if i'm missing something and why the check is required. Please join us on zulipchat so we can discuss this in a more interactive way.

by removing the check in GetMetadata commit https://github.com/apache/kvrocks/commit/aac63cc83d0457bc1c02061063975e6a1736b8cc is not needed

thank you for commit https://github.com/apache/kvrocks/commit/73062319eccf80ec06317b3541af8fca3313fd41 i will be importing it.

ltagliamonte-dd avatar Jan 13 '25 17:01 ltagliamonte-dd

@git-hulk I gave a look at your suggestion to remove duplicated code, it looks like duplicated but if you look closely they are all slightly different checks... we may be able to refactor with something like this:

rocksdb::Status Hash::HandleRedisHash(engine::Context &ctx, const Metadata &metadata, const std::string &value,
                                  const std::string &key, std::function<void(int)> on_valid_field_count);

so we pass an handleFunction into the refactored function Happy to hear if you have better/different ideas about it.

ltagliamonte-dd avatar Jan 13 '25 23:01 ltagliamonte-dd

@PragmaTwice @mapleFU @torwig @caipengbo To see if you guys can have a look at this PR. It generally looks good to me. It'd be nice to push this feature forward to avoid pending too long time.

git-hulk avatar Jan 15 '25 02:01 git-hulk

I'll take a look at this weekend if no other things happen : )

cc @mapleFU PTAL.

Also now we need to solve git conflicts before reviewing and merging. cc @ltagliamonte-dd

PragmaTwice avatar Jan 22 '25 03:01 PragmaTwice

@PragmaTwice can you please re-trigger the CI it looks like a flaky test

ltagliamonte-dd avatar Jan 22 '25 10:01 ltagliamonte-dd

I'm thinking that, size in hash metadata will become nearly a useless field while expiration is enabled, so it's a little tricky since we can easily misuse it. cc @git-hulk @mapleFU

Yes, we need to take care of this after introducing the hash expiration feature. But we can try to improve this before formally releasing it.

git-hulk avatar Jan 25 '25 09:01 git-hulk

Hello Folks any updates on review/merge this PR? Cc @PragmaTwice @git-hulk ?

ltagliamonte-dd avatar Feb 10 '25 22:02 ltagliamonte-dd

Folks do we have any updates on this? @PragmaTwice @git-hulk

ltagliamonte-dd avatar Mar 04 '25 19:03 ltagliamonte-dd

Hi @ltagliamonte-dd , sorry for late response.

This PR includes massive changes to existing, critical code (the encoding logic of HASH data structure). So it requires lots of time to review to ensure that, not only users that need this feature, but also users that don't care this feature, can have a good and consistent experience while using Kvrocks. Also we need to make sure it doesn't affect old users, and the performance doesn't have a big drop.

I currently don't have much time to complete a full review due to some personal affairs. Not sure if @git-hulk and @mapleFU have time to review. But if I get some time I'll move on for this PR : )

PragmaTwice avatar Mar 05 '25 02:03 PragmaTwice

Hi @ltagliamonte-dd , sorry for late response.

This PR includes massive changes to existing, critical code (the encoding logic of HASH data structure). So it requires lots of time to review to ensure that, not only users that need this feature, but also users that don't care this feature, can have a good and consistent experience while using Kvrocks. Also we need to make sure it doesn't affect old users, and the performance doesn't have a big drop.

I currently don't have much time to complete a full review due to some personal affairs. Not sure if @git-hulk and @mapleFU have time to review. But if I get some time I'll move on for this PR : )

From the implementation, it should not affect the user who doesn't use the hash expiration feature. But the performance would be heavily impacted once enabled, because it needs to count the valid field number at each operation. And it would be hard to improve once we have released this, so I have no confidence to say it's fine to have a try from my perspective. But I admit the hash expiration is a key feature for Kvrocks and desire to introduce this feature.

git-hulk avatar Mar 05 '25 08:03 git-hulk

thanks @git-hulk , what do you prefer to do? back to the drawing board? I'm happy to contribute if we find a better way to add this feature.

ltagliamonte avatar Mar 21 '25 13:03 ltagliamonte

I'm not sure, but maybe Redis (according to this PR, I'll try to double check) tracks the latest expiration time of the element in the hash so we can save it to Metadata and take it into consideration without checking all valid elements in the hash. So this solution will bring the same performance to both "normal hash" and the "hash with elements that have TTL". @git-hulk @PragmaTwice @mapleFU WDYT?

torwig avatar Mar 21 '25 14:03 torwig

Any updates or progress on this feature?

salarali avatar May 30 '25 19:05 salarali

need this feature.

filefox avatar Sep 09 '25 03:09 filefox