redis Use metadata to handle the reference relationship between kvstore and dict.

PR #12822 refactors the database, and part of it involved splitting the rehashing list from the global level back to the database level, or more specifically, the kvstore level. This change is fine, and it also simplifies the process of swapping databases, which is good. And it should not have a major impact on the efficiency of incremental rehashing.

To implement the kvstore-level rehashing list, each dict under the kvstore needs to know which kvstore it belongs. However, #12822 did not insert the reference relationship into the dict itself, instead, it placed it in the dictType. In my view, this is a somewhat odd way. Theoretically, dictType is just a collection of function handles, a kind of virtual type that can be referenced globally, not an entity. But #12822 instantiated the dictType, with each kvstore owning an actual dictType, which in turn holds a reverse reference to the kvstore's resource pointer. This design is somewhat uncomfortable for me.

I think the dictType should not be instantiated. The references between actual resources (kvstore and dict) should occur between specific objects, rather than force materializing the dictType, which is supposed to be virtual. I've tried some modifications, the code currently is not perfect, we can discuss this further.

Feb 19 '24 09:02 soloestoy

we did that trick on purpose (it was my design) to avoid the extra memory consumption. in the current code we added just one more kvstore pointer per kvstore (by duplicating / hacking the dictType). in your code you're adding another pointer per dict, i wanted to avoid that.

Feb 19 '24 12:02 oranagra

Using appropriate trick for performance optimization or space optimization is acceptable, but the premise is that it should not break the original design. For example, no_value and keys_are_odd are also very tricky methods, but they do not violate the design pattern of dictType.

In my understanding, dictType is similar to the Singleton pattern, where only one instance of a class of dictType should exist and can be shared by all users. Therefore, it should not create other copies, nor should it hold resources from other objects.

Regarding memory optimization, the current tricky method actually may increases memory usage because each kvstore has to hold a dictType object, needing a copy and thus using extra memory, we can examine the specific situation.

The size of dictType is 96 bytes. In standalone mode, each kvstore has only one dict, so putting the kvstore pointer in the dict's metadata increases memory by 8 bytes, but not needing a copy of dictType reduces memory by 96 bytes.

In cluster mode, in the worst-case scenario, a kvstore can have up to 16,384 dicts, and putting the kvstore pointer in the dict's metadata increases memory by 8 * 16,384 bytes, but reduces the 96 bytes needed for a copy of dictType. However, if there are multiple nodes, then the number of dicts held by each node's kvstore will decrease. In the extreme case, for example with 16,384 nodes, each node's kvstore would only have one dict, which then actually saves more memory (the kvstore's dicts are created as needed, so dicts for slots not belonging to the node are NULL), the kvstore pointer added 8 * 16,384 bytes, but eliminating the dictType copy saved 96 * 16,384 bytes.

Feb 20 '24 02:02 soloestoy

for the record, I agree with @soloestoy

In the worst case case (cluster mode, one node) we consume an extra 131kb In an average case (cluster mode, 10 nodes) we consume an extra 13kb per node

in this case, I think we need to lean towards better software design over the (relatively small amounts) of extra memory we pay for it

Feb 20 '24 03:02 guybe7

the worse case is not really a realistic one. and the extra 96 bytes for non-cluster mode isn't a real concern either.

the bottom line is that we're considering wasting 8 bytes per dict, vs 96 bytes per kvstore, so it means that on any cluster node with more than 12 slots is better off with the current committed design, right?

correct me if i'm wrong the common case is probably far more than 12 slots per node, right?

Feb 20 '24 07:02 oranagra

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

Mar 24 '24 23:03 CLAassistant

redis redis copied to clipboard

Use metadata to handle the reference relationship between kvstore and dict.

redis
redis copied to clipboard