feat(hash): add support of hash expiration commands
based on https://github.com/apache/kvrocks/pull/2402
- fixed CommandHStrlen return
- removed keyExpirationCheck in GetMetadata as per @PragmaTwice request
- verified all cmds
@jjz921024 I verified all hash cmds and they seem to work without the additional check in GetMetadata do you happen to remember if there was a specific use case you added it there?
@ltagliamonte-dd Hi. Did you set this config hash-field-expiration to yes or run the unit tests in this branch? If the field has an expiration, we have to check to the field expiration before each read operation.
@ltagliamonte-dd Hi. Did you set this config
hash-field-expirationto yes or run the unit tests in this branch? If the field has an expiration, we have to check to the field expiration before each read operation.
Hello @jjz921024 thanks for getting back to me, yes I did test all hash commands having hash-field-expiration yes.
I plan to give another looks at the code today, i'd love to get in touch with you to better understand the check because from what i could briefly see the key expiration check is also done in each cmds implementations.
I coudn't find your contact neither on the kvrocks slack nor zulipchat.
i did run the unit test with no problem the go integration test only a test case is failing (i plan to look at it today):
--- FAIL: TestHashFieldExpiration (7.66s)
--- FAIL: TestHashFieldExpiration/HFE_check_hash_metadata_after_all_of_fields_expired (1.00s)
hash_test.go:1018:
Error Trace: /Users/luigi.tagliamonte/Projects/doordash/kvrocks/tests/gocase/unit/type/hash/hash_test.go:1018
Error: Not equal:
expected: -2ns
actual : -1ns
Test: TestHashFieldExpiration/HFE_check_hash_metadata_after_all_of_fields_expired
FAIL
exit status 1
FAIL github.com/apache/kvrocks/tests/gocase/unit/type/hash 199.945s
@jjz921024 the CI seems to pass removing the check in GetMetadata and fixing GetTime. Would love to get it touch with you to see if I'm missing something here.
@jjz921024 @ltagliamonte-dd Thanks for your great efforts. I took my first pass in a new PR and generally look in good shape from my perspective. Two nit points can be improved:
- Change
ExistValidFieldtoGetValidFieldCount, so thatHash::Sizecan also resue this function. - Many places decode the hash metadata and check if it has any valid fields. We can avoid duplicate code by adding a dedicated function to this check.
if (metadata.Type() == kRedisHash) {
HashMetadata hash_metadata(false);
s = hash_metadata.Decode(rocksdb::Slice(pin_values[i].data(), pin_values[i].size()));
if (!s.ok()) continue;
redis::Hash hash_db(storage_, namespace_);
if (hash_db.ExistValidField(ctx, slice_keys[i], hash_metadata)) {
@ltagliamonte-dd Hi, I create a PR #1 in your kvrocks fork repo. There are some polish about this feature for your reference.
The purpose of this this commit https://github.com/apache/kvrocks/pull/2402/commits/aac63cc83d0457bc1c02061063975e6a1736b8cc is to avoid unnecessary field expiration checks when executing the hset/hmset command, which comes from https://github.com/apache/kvrocks/pull/2402#discussion_r1894554219 @PragmaTwice suggestion.
The purpose of this this commit https://github.com/apache/kvrocks/pull/2402/commits/73062319eccf80ec06317b3541af8fca3313fd41 is make ExistValidField method more general so that Hash::Size can also resue it. which comes from @git-hulk suggestion.
Hello @jjz921024 thank you for the help on getting this through the finish line. As I looked at the PR https://github.com/ltagliamonte-dd/kvrocks/pull/1 I can see that GetMetadata still contains the expiration check on all fields keys, that @PragmaTwice was pointing out as possible perf problem.
Discussing this check with @PragmaTwice and @git-hulk on zulipchat we decided to move the check up in the call stack and perform it only when required.
As the CI on this PR is showing (and my verification) it looks like there is no need for expiration check in GetMetadata, I would love to hear from you if i'm missing something and why the check is required. Please join us on zulipchat so we can discuss this in a more interactive way.
by removing the check in GetMetadata commit https://github.com/apache/kvrocks/commit/aac63cc83d0457bc1c02061063975e6a1736b8cc is not needed
thank you for commit https://github.com/apache/kvrocks/commit/73062319eccf80ec06317b3541af8fca3313fd41 i will be importing it.
@git-hulk I gave a look at your suggestion to remove duplicated code, it looks like duplicated but if you look closely they are all slightly different checks... we may be able to refactor with something like this:
rocksdb::Status Hash::HandleRedisHash(engine::Context &ctx, const Metadata &metadata, const std::string &value,
const std::string &key, std::function<void(int)> on_valid_field_count);
so we pass an handleFunction into the refactored function Happy to hear if you have better/different ideas about it.
@PragmaTwice @mapleFU @torwig @caipengbo To see if you guys can have a look at this PR. It generally looks good to me. It'd be nice to push this feature forward to avoid pending too long time.
I'll take a look at this weekend if no other things happen : )
cc @mapleFU PTAL.
Also now we need to solve git conflicts before reviewing and merging. cc @ltagliamonte-dd
@PragmaTwice can you please re-trigger the CI it looks like a flaky test
I'm thinking that,
sizein hash metadata will become nearly a useless field while expiration is enabled, so it's a little tricky since we can easily misuse it. cc @git-hulk @mapleFU
Yes, we need to take care of this after introducing the hash expiration feature. But we can try to improve this before formally releasing it.
Quality Gate passed
Issues
40 New issues
0 Accepted issues
Measures
0 Security Hotspots
69.9% Coverage on New Code
12.6% Duplication on New Code
Hello Folks any updates on review/merge this PR? Cc @PragmaTwice @git-hulk ?
Folks do we have any updates on this? @PragmaTwice @git-hulk
Quality Gate passed
Issues
41 New issues
0 Accepted issues
Measures
0 Security Hotspots
69.1% Coverage on New Code
13.9% Duplication on New Code
Hi @ltagliamonte-dd , sorry for late response.
This PR includes massive changes to existing, critical code (the encoding logic of HASH data structure). So it requires lots of time to review to ensure that, not only users that need this feature, but also users that don't care this feature, can have a good and consistent experience while using Kvrocks. Also we need to make sure it doesn't affect old users, and the performance doesn't have a big drop.
I currently don't have much time to complete a full review due to some personal affairs. Not sure if @git-hulk and @mapleFU have time to review. But if I get some time I'll move on for this PR : )
Hi @ltagliamonte-dd , sorry for late response.
This PR includes massive changes to existing, critical code (the encoding logic of HASH data structure). So it requires lots of time to review to ensure that, not only users that need this feature, but also users that don't care this feature, can have a good and consistent experience while using Kvrocks. Also we need to make sure it doesn't affect old users, and the performance doesn't have a big drop.
I currently don't have much time to complete a full review due to some personal affairs. Not sure if @git-hulk and @mapleFU have time to review. But if I get some time I'll move on for this PR : )
From the implementation, it should not affect the user who doesn't use the hash expiration feature. But the performance would be heavily impacted once enabled, because it needs to count the valid field number at each operation. And it would be hard to improve once we have released this, so I have no confidence to say it's fine to have a try from my perspective. But I admit the hash expiration is a key feature for Kvrocks and desire to introduce this feature.
thanks @git-hulk , what do you prefer to do? back to the drawing board? I'm happy to contribute if we find a better way to add this feature.
I'm not sure, but maybe Redis (according to this PR, I'll try to double check) tracks the latest expiration time of the element in the hash so we can save it to Metadata and take it into consideration without checking all valid elements in the hash. So this solution will bring the same performance to both "normal hash" and the "hash with elements that have TTL". @git-hulk @PragmaTwice @mapleFU WDYT?
Any updates or progress on this feature?
need this feature.