
allow shrinking hashtables in low memory situations

Open · Fusl opened this issue 7 months ago • 1 comment

During memory eviction, we may end up with a primary kv hash table whose fill is below the 13% MIN_FILL_PERCENT_SOFT threshold. However, inside resize() for the kv table, we consult ht->type->resizeAllowed to check whether allocating a new table would cross the configured maxmemory limit, without taking into account that shrinking, while temporarily increasing memory usage, ultimately reduces memory usage in all situations.
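
For illustration, here is a minimal sketch of the gating described above. The struct layout and the resizeAllowed signature are simplified assumptions modeled on this description, not the actual src/hashtable.c code:

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types; the real ones live in src/hashtable.c. */
typedef struct {
    /* Callback that may veto an allocation of more_mem bytes. */
    bool (*resizeAllowed)(size_t more_mem, double fill_ratio);
} hashtableType;

typedef struct {
    hashtableType *type;
    size_t num_buckets;
    size_t bucket_size;
    double fill_ratio; /* entries / capacity */
} hashtable;

/* Pre-change behavior: the veto applies to shrinks as well as grows, so
 * a shrink is blocked whenever the temporary extra allocation would
 * cross maxmemory, even though completing it would reduce total usage. */
static bool resize(hashtable *ht, size_t new_num_buckets) {
    size_t alloc_size = new_num_buckets * ht->bucket_size;
    if (ht->type->resizeAllowed &&
        !ht->type->resizeAllowed(alloc_size, ht->fill_ratio)) {
        return false; /* under memory pressure, a sparse table stays sparse */
    }
    /* ... allocate the new table and start incremental rehashing ... */
    return true;
}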

Blocking hash tables from shrinking in situations where we would only temporarily cross the maxmemory limit has at least three negative side effects:

  • The performance of valkey suffers if, for example, we're moving items from table A to table B, since table A eventually ends up as a very sparse hash table that never shrinks.
  • In extreme situations, valkey slows down significantly when trying to find more keys to evict, because it has to scan a sparse table that only becomes sparser over time.
  • In more extreme situations, such as when reducing maxmemory by around 90% or writing large data to a single key, valkey can end up in a state where, no matter how many keys are evicted from the kv table, the table's bucket struct overhead uses more memory than the data stored in the table. All keys end up evicted, and valkey remains unusable until maxmemory is set above the current memory usage, as the transcript below (and the numbers summarized after it) shows:
127.0.0.1:6379> config set maxmemory 2gb
OK
127.0.0.1:6379> debug populate 20000000
OK
(8.94s)
127.0.0.1:6379> debug htstats 0
[Dictionary HT]
Hash table 0 stats (main hash table):
 table size: 29360128
 number of entries: 20000000
[Expires HT]
127.0.0.1:6379> config set maxmemory 200mb
OK
127.0.0.1:6379> dbsize
(integer) 18725794
[...]
127.0.0.1:6379> dbsize
(integer) 0
127.0.0.1:6379> keys *
(empty array)
127.0.0.1:6379> set foo bar
(error) OOM command not allowed when used memory > 'maxmemory'.
127.0.0.1:6379> config get maxmemory
1) "maxmemory"
2) "209715200"
127.0.0.1:6379> info memory
# Memory
used_memory:269890904
used_memory_human:257.39M
[...]
127.0.0.1:6379> config set maxmemory 300mb
OK
127.0.0.1:6379> info memory
# Memory
used_memory:1455512
used_memory_human:1.39M
[...]
127.0.0.1:6379> set foo bar
OK
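
To put numbers on the third point (all values taken from the transcript above):

  used_memory with 0 keys:             269,890,904 B  (~257.39 MiB)
  maxmemory:                           209,715,200 B  (200 MiB)
  over the limit with an empty db:      60,175,704 B
  after the shrink (maxmemory 300mb):    1,455,512 B  (~1.39 MiB)

The empty database alone exceeds maxmemory, almost entirely due to the stale table, so every write is rejected until maxmemory is raised above the table overhead and the shrink can finally run.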

This PR addresses these issues by allowing hash tables to allocate a new table for shrinking, even if that allocation temporarily crosses the maxmemory limit during the resize.

Additionally, the check for whether we should fast-forward the resize operation has been moved to before the check of whether we are allowed to allocate a new table. Since no resize has actually been started at that point in the function, the fast-forward usually drops memory usage enough that another resize operation can start immediately after the fast-forwarded resize completes; see the sketch below.
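
Reusing the hypothetical types from the sketch above, the reordered logic could look roughly like this (the helper names are made up for illustration; the actual change is in src/hashtable.c):

/* Hypothetical helpers, stubbed out for the sketch. */
static bool rehashing_in_progress(hashtable *ht) { (void)ht; return false; }
static void rehash_fast_forward(hashtable *ht) { (void)ht; }

static bool resize_after_change(hashtable *ht, size_t new_num_buckets) {
    bool is_shrink = new_num_buckets < ht->num_buckets;

    /* 1. Fast-forward any in-progress rehash first. Finishing it frees
     *    the old table, which usually drops memory usage enough that
     *    the next resize is permitted immediately afterwards. */
    if (rehashing_in_progress(ht)) rehash_fast_forward(ht);

    /* 2. Only growth is subject to the maxmemory veto. A shrink's extra
     *    allocation is temporary and its net effect is a reduction, so
     *    it is always allowed. */
    size_t alloc_size = new_num_buckets * ht->bucket_size;
    if (!is_shrink && ht->type->resizeAllowed &&
        !ht->type->resizeAllowed(alloc_size, ht->fill_ratio)) {
        return false;
    }
    /* ... allocate the new table and start incremental rehashing ... */
    return true;
}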

Fusl · May 17 '25 07:05

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.39%. Comparing base (e53e048) to head (c16727d). Report is 6 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2095      +/-   ##
============================================
- Coverage     71.43%   71.39%   -0.04%     
============================================
  Files           123      123              
  Lines         67030    67067      +37     
============================================
+ Hits          47881    47882       +1     
- Misses        19149    19185      +36     
Files with missing lines   Coverage Δ
src/debug.c                53.62% <100.00%> (+0.04%) ↑
src/hashtable.c            82.42% <100.00%> (+0.11%) ↑
src/kvstore.c              95.02% <100.00%> (-0.02%) ↓
src/server.c               88.06% <100.00%> (-0.04%) ↓

... and 12 files with indirect coverage changes


codecov[bot] · May 17 '25 08:05

> There is a leak reported in the address-sanitizer job. Maybe it's fixed by #2288 that I merged yesterday. Can you merge the latest unstable and see if the leak is gone?

My latest force push, where I rebased on the unstable branch, already includes that commit, so I'm not sure what's going on. I'll take a closer look at it.

Fusl · Jul 03 '25 10:07

The address-sanitizer job https://github.com/valkey-io/valkey/actions/runs/16039519166/job/45258225409?pr=2095#step:7:437 complains about memory allocated by hashtableCreate. This is after a failed assertion here: https://github.com/valkey-io/valkey/actions/runs/16039519166/job/45258225409?pr=2095#step:7:109

When an assertion fails inside a test case, the test is aborted, so we never free the hash table. I don't think the leak itself is a problem; once we fix the failing assert, it should go away.

The unit test framework complains about a leak after every subsequent test suite because it only checks that all memory ever allocated by zmalloc has been freed, so these reports are false positives. It'd be better if it checked per test suite.
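
For illustration, a per-suite check could snapshot the allocator counter around each suite. A minimal sketch, assuming the existing zmalloc_used_memory() accessor (the harness names here are made up):

#include <stdio.h>
#include <stddef.h>

size_t zmalloc_used_memory(void); /* provided by src/zmalloc.c */

typedef int (*suite_fn)(void); /* returns the number of failures */

/* Hypothetical wrapper: compare zmalloc usage before and after a single
 * suite, so a leak in suite A is reported once instead of after every
 * later suite as well. */
static int run_suite_checked(const char *name, suite_fn suite) {
    size_t before = zmalloc_used_memory();
    int failures = suite();
    size_t after = zmalloc_used_memory();
    if (after > before) {
        fprintf(stderr, "suite %s leaked %zu bytes\n", name, after - before);
        failures++;
    }
    return failures;
}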

zuiderkwast · Jul 03 '25 12:07

I hope this PR can prevent a memory leak from being reported multiple times:

  • https://github.com/valkey-io/valkey/pull/2304

zuiderkwast · Jul 03 '25 12:07

> This is after a failed assertion here: https://github.com/valkey-io/valkey/actions/runs/16039519166/job/45258225409?pr=2095#step:7:109

The last commit should fix that test case as well now.

Fusl · Jul 03 '25 17:07