
feat(hash): add per-field expiration support (HEXPIRE, HTTL, HPERSIST)

Open tejaslodayadd opened this issue 1 month ago • 8 comments

Implements per-field expiration support for the HASH data type, following the Redis 7.4 specification.

Commands:

HEXPIRE key seconds FIELDS numfields field [field ...]
HTTL key FIELDS numfields field [field ...]
HPERSIST key FIELDS numfields field [field ...]
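
A quick usage sketch following Redis 7.4's reply conventions (key and field names are made up for illustration):

    redis> HSET sess user alice token abc123
    (integer) 2
    redis> HEXPIRE sess 60 FIELDS 1 token
    1) (integer) 1
    redis> HTTL sess FIELDS 2 user token
    1) (integer) -1
    2) (integer) 60
    redis> HPERSIST sess FIELDS 1 token
    1) (integer) 1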

Implementation Details

  • Uses a backward-compatible, flag-based encoding for field values
  • Legacy format (no expiration): [raw value]
  • New format (with expiration): [0xFF][flags][8-byte timestamp][value] (decoding sketched below)
  • Existing data works without migration
  • Expired fields are filtered out on read operations (HGET, HMGET, HGETALL, etc.)
  • Compaction filter cleans up expired fields with a 5-minute lazy-delete buffer
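
To make the encoding concrete, here is a minimal decode sketch under the assumptions above; the marker/flag constants and function names are illustrative, not the PR's actual code, and the timestamp byte order is assumed:

    #include <cstdint>
    #include <cstring>
    #include <optional>
    #include <string>

    constexpr uint8_t kMarker = 0xFF;  // assumed marker byte
    constexpr uint8_t kHasTTL = 0x01;  // assumed "has expiration" flag bit

    // Returns the field's raw value, or nullopt if the field has expired.
    // Values without the marker are legacy-format and pass through untouched,
    // which is why existing data needs no migration.
    std::optional<std::string> DecodeFieldValue(const std::string &raw, uint64_t now_ms) {
      if (raw.size() < 2 || static_cast<uint8_t>(raw[0]) != kMarker) {
        return raw;  // legacy format: the bytes are the value itself
      }
      const uint8_t flags = static_cast<uint8_t>(raw[1]);
      size_t off = 2;
      if (flags & kHasTTL) {
        if (raw.size() < off + sizeof(uint64_t)) return raw;  // malformed: treat as legacy
        uint64_t expire_ms = 0;
        std::memcpy(&expire_ms, raw.data() + off, sizeof(expire_ms));
        off += sizeof(expire_ms);
        if (expire_ms <= now_ms) return std::nullopt;  // expired: hidden from reads
      }
      return raw.substr(off);
    }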

This extends the existing key-level expiration (HSETEXPIRE) added in #2750 to per-field granularity, a common feature request, matching Redis 7.4's hash field expiration support.

Reference: https://redis.io/docs/latest/develop/data-types/hashes/

tejaslodayadd avatar Nov 24 '25 21:11 tejaslodayadd

@torwig could you please check this PR? As I remember, we talked about this feature. Thanks

aleksraiden avatar Nov 24 '25 22:11 aleksraiden

@tejaslodayadd lots of thanks, cool feature, we need support for this

aleksraiden avatar Nov 24 '25 22:11 aleksraiden

After discussing this internally, there's a problem:

Currently, HLEN returns the field count by reading metadata.size directly - an O(1) operation. However, with per-field expiration:

  • Expired fields remain in storage until compaction cleans them up
  • The metadata.size counter doesn't know when individual fields expire
  • Result: A hash with 5 fields where 1 has expired will still report HLEN = 5 instead of 4

Options:

  1. Accept the approximate count: keep the current O(1) behavior and document that HLEN returns an approximate count when field TTLs are in use, becoming accurate after compaction (eventually consistent)
  2. Scan-based count: HLEN scans all fields and counts the non-expired ones. Con: O(n) instead of O(1), expensive for large hashes
  3. Optimized scan with metadata tracking: add a flag to HashMetadata recording whether any field has an expiration set. HLEN returns metadata.size directly if no fields have a TTL -- O(1) -- and only scans when the hash contains fields with expirations -- O(n). See the sketch after this list
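
For option 3, a minimal sketch of the fast/slow path split (the type and field names here are hypothetical, not kvrocks' actual metadata layout):

    #include <cstdint>
    #include <functional>

    struct HashMetadata {
      uint64_t size = 0;
      bool any_field_has_ttl = false;  // proposed flag for option 3
    };

    // count_live_fields stands in for an O(n) scan that skips expired fields.
    uint64_t HLenValue(const HashMetadata &meta,
                       const std::function<uint64_t()> &count_live_fields) {
      if (!meta.any_field_has_ttl) return meta.size;  // no TTLs anywhere: O(1)
      return count_live_fields();                     // TTLs in use: O(n) scan
    }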

Let us know which way you'd want us to move forward @PragmaTwice @torwig @aleksraiden

tejaslodayadd avatar Nov 25 '25 01:11 tejaslodayadd

@tejaslodayadd Thanks for your contribution.

I just had a glance at this PR and have two questions about the design:

  • When would the size in metadata be updated? It seems the size won't be corrected after the compaction.
  • Should we also take care of the HSCAN command? Let's filter out the expired fields.

From my personal perspective, it's acceptable to sacrifice the short-term correctness of a command like HLEN to improve performance.

git-hulk avatar Nov 25 '25 03:11 git-hulk

Hi, thank you for your contribution. I skimmed through and found some issues with this design:

  • "New format (with expiration): [0xFF][flags][8-byte timestamp][value]". Hmm, a hash field value can actually be binary, so even in the old encoding the first byte of the value CAN BE 0xFF. How would you distinguish them? I don't think you can.
  • -2 = field doesn't exist, 1 = expiration set, 0 = expiration not set (e.g., invalid expire time). A comment like this usually indicates that you should use enum classes instead of bare integers (see the sketch after this list).
  • The size field in the metadata doesn't seem well maintained in the compaction filters.
  • It seems HashFieldValue could be a view (zero-copy)?
  • The code seems vibe-coded. That's fine, but there are lots of useless code comments, and the PR author should understand the details of the LLM-written code (so that the review process can be meaningful).
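
To illustrate the enum-class point, a minimal sketch (the type and enumerator names are hypothetical):

    // A typed replacement for the bare -2/0/1 return values.
    enum class FieldExpireResult : int {
      kNoSuchField = -2,  // field doesn't exist
      kNotSet = 0,        // expiration not set (e.g., invalid expire time)
      kSet = 1,           // expiration set
    };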

PragmaTwice avatar Nov 25 '25 04:11 PragmaTwice

@git-hulk @PragmaTwice FYI I'm going to help @tejaslodayadd with this PR. I've addressed a few of the comments you folks left.

I'm currently debating how to proceed with the compaction/size problem. During compaction of the default CF we can't update the metadata CF (where the hash size lives), so I think we could go for a lazy-repair pattern (sketched after this list):

  • During compaction, if a field is expired, we just overwrite its value with a stub value (0xFE) so we immediately reclaim the disk space
  • During read operations like HGET, HGETALL, and HSCAN we do a synchronous/asynchronous repair:
    • Create an atomic WriteBatch.
    • Delete the key from defaultCF.
    • Update/merge the count in metadataCF.
    • Commit.
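
For concreteness, a minimal sketch of the two halves of this idea against the RocksDB API; the handles, key layouts, stub byte, and the IsExpiredField helper are assumptions taken from the description above, not kvrocks' actual code:

    #include <string>

    #include <rocksdb/compaction_filter.h>
    #include <rocksdb/db.h>
    #include <rocksdb/write_batch.h>

    bool IsExpiredField(const rocksdb::Slice &value);  // hypothetical helper

    // Compaction half: replace an expired field's value with a 1-byte stub so
    // most of the space is reclaimed without touching the metadata CF.
    class HashFieldStubFilter : public rocksdb::CompactionFilter {
     public:
      bool Filter(int /*level*/, const rocksdb::Slice & /*key*/,
                  const rocksdb::Slice &existing_value, std::string *new_value,
                  bool *value_changed) const override {
        if (IsExpiredField(existing_value)) {
          *new_value = std::string(1, '\xFE');  // stub marker from the proposal
          *value_changed = true;
        }
        return false;  // never drop the key here; repair happens on read
      }
      const char *Name() const override { return "HashFieldStubFilter"; }
    };

    // Read half: when a read hits a stub/expired field, atomically delete it
    // and rewrite the hash metadata with the decremented size.
    rocksdb::Status RepairExpiredField(rocksdb::DB *db,
                                       rocksdb::ColumnFamilyHandle *default_cf,
                                       rocksdb::ColumnFamilyHandle *metadata_cf,
                                       const rocksdb::Slice &field_key,
                                       const rocksdb::Slice &metadata_key,
                                       const std::string &updated_metadata) {
      rocksdb::WriteBatch batch;
      batch.Delete(default_cf, field_key);
      batch.Put(metadata_cf, metadata_key, updated_metadata);
      return db->Write(rocksdb::WriteOptions(), &batch);
    }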

What do you folks think about this approach?

ltagliamonte-dd avatar Dec 02 '25 22:12 ltagliamonte-dd

When it comes to the binary-value ambiguity: I've implemented a new format that uses a two-byte magic marker (0xFF 0xFE) instead of a single byte. Let me know if this works for you @PragmaTwice

New format: [0xFF][0xFE][1-byte flags][8-byte timestamp if flag set][value]
Old format: [0xFF][1-byte flags][8-byte timestamp if flag set][value]
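
A minimal sketch of the corresponding prefix test (byte values taken from the message above; the function name is made up):

    #include <cstdint>
    #include <string>

    // True if the value carries the proposed two-byte marker.
    inline bool HasNewFormatMarker(const std::string &raw) {
      return raw.size() >= 2 && static_cast<uint8_t>(raw[0]) == 0xFF &&
             static_cast<uint8_t>(raw[1]) == 0xFE;
    }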

ltagliamonte-dd avatar Dec 02 '25 23:12 ltagliamonte-dd

@PragmaTwice @git-hulk Just pushed the lazy-repair approach discussed above, looking forward to your feedback! Thanks!

ltagliamonte-dd avatar Dec 05 '25 21:12 ltagliamonte-dd