feat(hash): add per-field expiration support (HEXPIRE, HTTL, HPERSIST)
Implements per-field expiration support for the HASH data type, following the Redis 7.4 specification.
Commands:
HEXPIRE key seconds FIELDS numfields field [field ...]
HTTL key FIELDS numfields field [field ...]
HPERSIST key FIELDS numfields field [field ...]
Implementation Details
- Uses backward-compatible flag-based encoding for field values
  - Legacy format (no expiration): [raw value]
  - New format (with expiration): [0xFF][flags][8-byte timestamp][value]
  - Existing data works without migration
- Expired fields are filtered out on read operations (HGET, HMGET, HGETALL, etc.)
- Compaction filter cleans up expired fields with 5-minute lazy delete buffer
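A minimal sketch of the flag-based encoding described above, written in Python for brevity rather than taken from the PR. The flag bit `FLAG_HAS_EXPIRE`, the big-endian byte order, the millisecond unit, and all function names are assumptions made for illustration; the description only fixes the layout `[0xFF][flags][8-byte timestamp][value]`.

```python
import struct
from typing import Optional, Tuple

MARKER = 0xFF            # marker byte from the layout above
FLAG_HAS_EXPIRE = 0x01   # hypothetical flag bit; the PR does not list flag values
HEADER_LEN = 10          # 1 marker byte + 1 flags byte + 8 timestamp bytes


def encode_field_value(value: bytes, expire_ms: Optional[int] = None) -> bytes:
    """Return the legacy form when there is no expiration, else the new form."""
    if expire_ms is None:
        return value  # legacy format: [raw value]
    header = bytes([MARKER, FLAG_HAS_EXPIRE]) + struct.pack(">Q", expire_ms)
    return header + value


def decode_field_value(raw: bytes) -> Tuple[bytes, Optional[int]]:
    """Split a stored value into (value, expire_ms or None)."""
    if len(raw) >= HEADER_LEN and raw[0] == MARKER and raw[1] & FLAG_HAS_EXPIRE:
        (expire_ms,) = struct.unpack(">Q", raw[2:HEADER_LEN])
        return raw[HEADER_LEN:], expire_ms
    return raw, None  # anything else is treated as a legacy value


def is_expired(expire_ms: Optional[int], now_ms: int) -> bool:
    """Read paths such as HGET would skip fields for which this is true."""
    return expire_ms is not None and expire_ms <= now_ms
```

Values written before this change decode through the fallback branch, which is why no migration would be needed under this scheme.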
This extends the existing key-level expiration (HSETEXPIRE) added in #2750 to support per-field granularity, which is a common feature request similar to Redis 7.4's hash field expiration support.
Reference: https://redis.io/docs/latest/develop/data-types/hashes/
@torwig could you please check this PR? As I remember, we talked about this feature. Thanks
@tejaslodayadd thanks a lot, cool feature, we need support for this
After discussing this internally, there's a problem:
Currently, HLEN returns the field count by reading metadata.size directly - an O(1) operation. However, with per-field expiration:
- Expired fields remain in storage until compaction cleans them up
- The `metadata.size` counter doesn't know when individual fields expire
- Result: A hash with 5 fields where 1 has expired will still report `HLEN = 5` instead of `4`
Options:
- Accept approximate count: Keep current O(1) behavior, document that `HLEN` returns an approximate count when using field TTL, accurate after compaction (eventually consistent)
- Scan-based count: `HLEN` scans all fields and counts non-expired ones. Con: O(n) instead of O(1), expensive for large hashes
- Optimized scan with metadata tracking: Add a flag in `HashMetadata` to track if any field has expiration set. `HLEN` returns `metadata.size` directly if no fields have TTL (O(1)), and only scans when the hash has fields with expiration (O(n))
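The third option can be sketched as follows. This is an illustrative in-memory model in Python, not the PR's C++ code: `HashSketch`, `has_field_ttl`, and the method names are hypothetical, with `size` standing in for `metadata.size`.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class HashSketch:
    # field name -> (value, expire_ms or None)
    fields: Dict[bytes, Tuple[bytes, Optional[int]]] = field(default_factory=dict)
    size: int = 0                # stands in for metadata.size
    has_field_ttl: bool = False  # hypothetical flag stored in HashMetadata

    def hset(self, name: bytes, value: bytes) -> None:
        if name not in self.fields:
            self.size += 1
        # Simplification: overwriting a field drops any TTL it had.
        self.fields[name] = (value, None)

    def hexpire(self, name: bytes, expire_ms: int) -> bool:
        if name not in self.fields:
            return False
        value, _ = self.fields[name]
        self.fields[name] = (value, expire_ms)
        self.has_field_ttl = True  # from now on HLEN must scan
        return True

    def hlen(self, now_ms: int) -> int:
        if not self.has_field_ttl:
            return self.size  # O(1) fast path: no field has a TTL
        # O(n) slow path: count only the fields that are still live
        return sum(
            1 for _, exp in self.fields.values() if exp is None or exp > now_ms
        )
```

In this sketch the flag is never cleared, so a hash that has used field TTL once stays on the slow path; deciding when the flag can be reset safely is left open here.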
Let us know which way you'd want us to move forward @PragmaTwice @torwig @aleksraiden
@tejaslodayadd Thanks for your contribution.
I just had a glance at this PR and have two questions about the design:
- When would the size in metadata be updated? It seems the size won't be corrected after the compaction.
- Should we also take care of the HSCAN command? Let's filter out the expired fields.
From my personal perspective, it's acceptable to sacrifice the short-term correctness of a command like HLEN to improve performance.
Hi, thank you for your contribution. I skimmed through and found some issues in this design:
- "New format (with expiration): [0xFF][flags][8-byte timestamp][value]". Hmm, the hash field value can actually be binary, so even in the old encoding, the first byte of the value CAN BE `0xff`. How do we distinguish them? I don't think we can.
- `-2 = field doesn't exist, 1 = expiration set, 0 = expiration not set (e.g., invalid expire time)`. Usually a comment like this indicates that you should use enum classes instead of integers.
- The `size` field in the metadata does not seem to be well maintained in the compaction filters.
- It seems `HashFieldValue` can be a view (zero-copy)?
- The code seems vibe-coded. That is fine, but there are lots of useless code comments, and the PR author should understand the details of the LLM-written code (so that the review process can be meaningful).
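To make the first point concrete, here is a small Python illustration of the ambiguity. `naive_decode` is a hypothetical decoder written for this example (flag bit `0x01` and big-endian timestamps are assumptions); it shows how a legacy binary value that happens to start with `0xFF` would be misread as the new format.

```python
import struct
from typing import Optional, Tuple


def naive_decode(raw: bytes) -> Tuple[bytes, Optional[int]]:
    """Treat any value starting with 0xFF plus flag bit 0x01 as the new format."""
    if len(raw) >= 10 and raw[0] == 0xFF and raw[1] & 0x01:
        (expire,) = struct.unpack(">Q", raw[2:10])
        return raw[10:], expire
    return raw, None


# A legacy value written before the change: arbitrary binary data, no expiration.
legacy_value = b"\xff\x01" + b"binary-payload"

# The decoder cannot tell it apart from an encoded value: the next 8 bytes are
# taken as a timestamp and only the tail survives as the "value".
value, expire = naive_decode(legacy_value)
```

Here `value` comes back as `b"ayload"` and `expire` as the integer formed from the bytes `b"binary-p"`, so the stored data is silently corrupted on read.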
@git-hulk @PragmaTwice FYI I'm going to help @tejaslodayadd with this PR. I've addressed a few of the comments you folks left.
I'm currently debating how to proceed with the compaction/size problem. During compaction of the default CF we can't update the metadata CF (where the hash size is), so I think we could go for a lazy repair pattern:
- During compaction, if a field is expired, we just update its value to a STUB value (0xFE) so we immediately reclaim the disk space.
- During read operations like HGET, HGETALL, HSCAN we do a synchronous/asynchronous repair:
  - Create an atomic WriteBatch.
  - Delete the key from defaultCF.
  - Update/merge the count in metadataCF.
  - Commit.
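The repair steps above can be sketched roughly like this. It is a toy Python model, not the actual implementation: `StoreSketch`, `hget_with_repair`, and the tuple-based batch are hypothetical stand-ins for the two column families and an atomic WriteBatch, and only the synchronous variant is shown.

```python
STUB = b"\xfe"  # stub value left behind by the compaction filter


class StoreSketch:
    def __init__(self):
        self.default_cf = {}   # (hash key, field) -> stored value
        self.metadata_cf = {}  # hash key -> field count

    def write(self, batch):
        """Apply every operation in the batch together (models an atomic WriteBatch)."""
        for op, cf_name, key, val in batch:
            cf = getattr(self, cf_name)
            if op == "delete":
                cf.pop(key, None)
            else:
                cf[key] = val


def hget_with_repair(store, key, field):
    raw = store.default_cf.get((key, field))
    if raw is None:
        return None
    if raw != STUB:
        # Simplification: a real value equal to the stub byte is not handled here.
        return raw
    # The field was stubbed out by compaction: drop it and fix the count in one batch.
    batch = [
        ("delete", "default_cf", (key, field), None),
        ("put", "metadata_cf", key, store.metadata_cf[key] - 1),
    ]
    store.write(batch)
    return None
```

Because the stub is deleted in the same batch that decrements the count, a second read of the same field finds nothing and does not decrement again.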
What do you folks think about this approach?
When it comes to distinguishing the new encoding from legacy values that start with 0xFF: I've implemented a new format that uses a two-byte magic marker (0xFF 0xFE) instead of a single byte. Let me know if this works for you @PragmaTwice
New format: [[0xFF][0xFE][1-byte flags][8-byte timestamp if flag set][value]]
Old format: [[0xFF][1-byte flags][8-byte timestamp if flag set][value]]
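A rough Python sketch of how the two-byte marker layout could be parsed, for illustration only. The flag bit `FLAG_HAS_EXPIRE`, the big-endian timestamp, and the choice to always write the marker on encode are assumptions, not details confirmed in this thread.

```python
import struct
from typing import Optional, Tuple

MAGIC = b"\xff\xfe"      # two-byte magic marker from the comment above
FLAG_HAS_EXPIRE = 0x01   # hypothetical flag bit


def encode(value: bytes, expire_ms: Optional[int] = None) -> bytes:
    """Assumed writer: always emits the marker, timestamp only if the flag is set."""
    if expire_ms is None:
        return MAGIC + bytes([0]) + value
    return MAGIC + bytes([FLAG_HAS_EXPIRE]) + struct.pack(">Q", expire_ms) + value


def decode(raw: bytes) -> Tuple[bytes, Optional[int]]:
    """Return (value, expire_ms or None); non-matching input is treated as legacy."""
    if len(raw) < 3 or raw[:2] != MAGIC:
        return raw, None  # legacy raw value
    flags = raw[2]
    if flags & FLAG_HAS_EXPIRE:
        if len(raw) < 11:
            return raw, None  # too short for a timestamp: fall back to legacy
        (expire_ms,) = struct.unpack(">Q", raw[3:11])
        return raw[11:], expire_ms
    return raw[3:], None
```

A value such as `b"\xff\x01legacy"`, which the single-byte marker could misread, now falls through to the legacy branch because its second byte is not `0xFE`.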
@PragmaTwice @git-hulk Just pushed the lazy repair approach discussed before, looking forward to your feedback! Thanks!