valkey
valkey copied to clipboard
[Improvement][Cluster Mode] Remove Unowned Keys After Loading Persistence Files At Server Startup
Describe the Issue While the server starts with cluster mode and loading persistence files, keys may not be served by this server due to slot coverage of multiple shards. This happens when:
- slot migration occurred after persistence files generation, server restarts before new persistence files generated and loads the outdated persistence files.
- server startups using pre-generated persistence files for cache warmup, boot up new cluster using existing data, etc. In this case, slot distributions among multiple shards might be different with persistence files.
In above cases, key belongs to a slot might not be served by a server, but that key will be added into this server in current code path. This issue leads to orphaned keys that leaks memory.
We can add a deletion call to clean up unowned keys after the loading, but before the server servers requests.
I can make a PR for this if this issue makes sense.
This makes sense to me. We talked about fixing this in the past but never did anything about it, so I think a contribution is welcome since this is practically a bugfix.
Sounds good, I can make a PR soon.
With the latest RDB file, we have the slot op code, we can use that to skip through the slot/keys which isn't owned by the node. This would ideally be faster.
Wouldn't this cause some inconsistency if some keys of the rdb file is loaded and some keys are discarded? For a cache it may be fine but for a database it may be a problem? Or do we care about consistency only per slot?
Wouldn't this cause some inconsistency if some keys of the rdb file is loaded and some keys are discarded? For a cache it may be fine but for a database it may be a problem? Or do we care about consistency only per slot?
The problem is we end up with keys today that are fully inaccessible. If you wedge a key into a slot you do not own, you can't coherently access it or delete. it.
Or do we care about consistency only per slot?
That has been our historic criteria, we are only consistent within a slot. We could consider changing it.
Fine, I like the idea then.
(I've been annoyed by DEBUG POPULATE too, that it adds keys in non-owned slots. I wish we can fix that too in some other PR