With SSD Tiering, Dragonfly process RSS exceeds configured maxmemory (540 GB) → OOM
Describe the bug
When running Dragonfly with SSD tiering and --maxmemory=540G, the Dragonfly process RSS (resident memory) grows beyond the configured maxmemory during bulk load (and subsequent queries). On my host (840 GB RAM), RSS rises well above 540 GB (e.g., ~700 GB), and the process is eventually killed by OOM, even though a 7–9 TB NVMe SSD is available for tiered storage.
To Reproduce
- Start Dragonfly with SSD tiering:

```bash
/home/ubuntu/dragonfly --logtostderr --cache_mode=false --tiered_experimental_cooling=false \
  --dbnum=1 --port=6379 --logbuflevel=-1 --conn_use_incoming_cpu=true \
  --maxmemory=540G --masterauth=${DF_PASSWORD} --requirepass=${DF_PASSWORD} \
  --break_replication_on_master_restart=true \
  --tiered_offload_threshold=0.25 \
  --tiered_prefix /mnt/localDiskSSD/dfssd \
  --dir=/mnt/localDiskSSD/backup \
  --cluster_mode=emulated --lock_on_hashtags --interpreter_per_thread=128
```

- Populate data:

```bash
redis-cli DEBUG POPULATE 500000000 key 4096
```

- Observe process RSS vs. maxmemory during/after load (a small monitoring sketch follows these steps):

```bash
PID=$(pidof dragonfly)
ps -o pid,rss,vsz,cmd -p $PID
grep -E 'VmRSS|VmSize' /proc/$PID/status
# (optional) compare with Dragonfly's internal counters
redis-cli INFO MEMORY | egrep "used_memory_human|maxmemory_human"
```

- After crossing the threshold, RSS grows beyond `maxmemory` and the service is terminated by the OOM killer.
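For reference, the small monitoring helper I use during the load (my own sketch, not a Dragonfly tool; it assumes `pidof` finds a single `dragonfly` process, and the 540 GiB constant matches `--maxmemory=540G`, which Dragonfly reports as 540.00GiB in INFO):

```bash
#!/usr/bin/env bash
# Log Dragonfly RSS against the configured 540 GiB cap once per second.
CAP_BYTES=$((540 * 1024 * 1024 * 1024))      # matches --maxmemory=540G (540.00GiB in INFO)
PID=$(pidof -s dragonfly)

while kill -0 "$PID" 2>/dev/null; do
  rss_kb=$(awk '/VmRSS/ {print $2}' /proc/"$PID"/status)   # VmRSS is reported in kB
  rss_bytes=$((rss_kb * 1024))
  echo "$(date +%T) rss_gib=$((rss_bytes / 1024 / 1024 / 1024)) over_cap=$(( rss_bytes > CAP_BYTES ? 1 : 0 ))"
  sleep 1
done
echo "dragonfly process $PID exited (possibly OOM-killed)"
```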
Expected behavior
With SSD tiering enabled, the Dragonfly process RSS should remain at or below the configured maxmemory (allowing for reasonable overhead), offloading eligible values to SSD to avoid OOM.
Actual behavior
Process RSS exceeds maxmemory by a large margin during/after loading, leading to OOM despite ample free SSD capacity at --tiered_prefix. (In my runs, used_memory also floats above the cap.)
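A quick way to quantify the overshoot from Dragonfly's own counters (a small sketch over the `INFO MEMORY` fields shown below; add `-a` if `requirepass` is set):

```bash
# Print used_memory_rss vs. maxmemory and the overshoot, all in GiB.
redis-cli INFO MEMORY | awk -F: '
  /^used_memory_rss:/ { rss = $2 }
  /^maxmemory:/       { cap = $2 }
  END { printf "rss=%.1fGiB cap=%.1fGiB overshoot=%.1fGiB\n", rss/2^30, cap/2^30, (rss-cap)/2^30 }'
```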
Screenshots / Logs
Please attach:

- Output of `INFO MEMORY` around the event:

```
redis-cli INFO MEMORY | egrep "used_memory|maxmemory"
# Memory
used_memory:521838494624
used_memory_human:486.00GiB
used_memory_peak:522897895552
used_memory_peak_human:486.99GiB
fibers_stack_vms:59866960
fibers_count:919
used_memory_rss:715525791744
used_memory_rss_human:666.38GiB
used_memory_peak_rss:522897895552
maxmemory:579820584960
maxmemory_human:540.00GiB
used_memory_lua:0
object_used_memory:481883549696
type_used_memory_string:481883549696
table_used_memory:33876148144
prime_capacity:880802160
expire_capacity:107520
num_entries:509999989
inline_keys:509999988
small_string_bytes:0
pipeline_cache_bytes:0
dispatch_queue_bytes:0
dispatch_queue_subscriber_bytes:0
dispatch_queue_peak_bytes:306642
client_read_buffer_peak_bytes:16414208
tls_bytes:26056
snapshot_serialization_bytes:0
commands_squashing_replies_bytes:0
psync_buffer_size:0
psync_buffer_bytes:0
cache_mode:store
maxmemory_policy:noeviction
replication_streaming_buffer_bytes:0
replication_full_sync_buffer_bytes:0
```

- Directory size at the tiered path:

```
du -sh /mnt/localDiskSSD/dfssd
2.9T    /mnt/localDiskSSD
```
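If helpful, the growth of the tiered files over time can be captured with a loop like this (a sketch; the `dfssd*` glob is an assumption about how the backing files are named under `--tiered_prefix`):

```bash
# Log the total size of the tiered backing files every 30 seconds.
while true; do
  # `du -c` prints a grand total as its last line (size in kB with -k).
  total_kb=$(du -sck /mnt/localDiskSSD/dfssd* 2>/dev/null | awk 'END {print $1}')
  echo "$(date +%T) tiered_gib=$(( ${total_kb:-0} / 1024 / 1024 ))"
  sleep 30
done
```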
Environment
- OS: Ubuntu 20.04
- Kernel: 6.5.0-1018-gcp 18-Ubuntu
- Dragonfly Version: 1.32.0
- Hardware:
  - RAM: 840 GB
  - SSD for tiering: local NVMe (LSSD), ~9 TB available
Reproducible Code Snippet
```bash
# 1) Start DF as above (tiering + 540G maxmemory)
# 2) Populate with 500M keys of ~4KB values
redis-cli DEBUG POPULATE 500000000 key 4096
# 3) Observe process RSS vs cap
PID=$(pidof dragonfly)
watch -n1 "ps -o pid,rss,vsz,cmd -p $PID; echo ----; redis-cli INFO MEMORY | egrep 'used_memory_human|maxmemory_human'"
```
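To confirm that the termination comes from the kernel OOM killer (and to capture its report), the kernel log can be watched in parallel; a minimal sketch that needs root:

```bash
# Follow the kernel log and surface OOM-killer activity as it happens.
sudo dmesg -wT | grep -iE 'out of memory|oom-kill|killed process'
```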
Additional context
- Total dataset: ~1.7 TB (actual).
- Tiered path: `/mnt/localDiskSSD/dfssd` on a 7–9 TB NVMe; disk IOPS/bandwidth were idle/available.
- Despite tiering being configured, RSS rises above the 540 GB cap and the process OOMs.
- After running FLUSHALL, the data did not get removed from the disk! (A quick verification sketch follows.)
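Rough steps I used to check this (a sketch; the `dfssd*` glob is an assumption about how the backing files are named under `--tiered_prefix`):

```bash
# Check whether FLUSHALL reclaims the tiered files on disk.
du -sh /mnt/localDiskSSD/dfssd*        # size before
redis-cli -a "$DF_PASSWORD" FLUSHALL
sleep 60                               # allow time for any background cleanup
du -sh /mnt/localDiskSSD/dfssd*        # in my runs the size did not shrink
```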
@romange, could you please review this and advise on any workarounds that would help us avoid the problem?
I am sorry, we are busy with other tasks.
Thanks for letting us know. We'll keep an eye on this issue, as it's of high interest to us.
@romange are you planning to come back to this bug soon? It makes the feature useless for us, as our goal with SSD tiering is to cap max memory while letting the store grow on disk.
What was the instance type on GCP? Can you please repeat this experiment with version v1.35?
We fixed a few small bugs and added backpressure for writes, so the instance will throttle them when it reaches the memory limit.
Please also set `tiered_storage_write_depth` to at least a few thousand (it controls the maximum number of concurrent writes). Also, `DEBUG POPULATE` is not optimized to work with tiering: it uses a limited number of concurrent writes, so the total write throughput will be quite low.
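For a tiering-friendly load, a benchmark tool that keeps many writes in flight works much better than `DEBUG POPULATE`; for example, a rough sketch with memtier_benchmark (assuming it is installed; the thread/client/pipeline numbers are illustrative and should be tuned together with `--tiered_storage_write_depth`):

```bash
# Write-only load of ~4 KB values with many concurrent in-flight requests.
memtier_benchmark -s 127.0.0.1 -p 6379 -a "$DF_PASSWORD" \
  --ratio=1:0 --data-size=4096 \
  --key-maximum=500000000 --key-pattern=P:P -n allkeys \
  --threads=8 --clients=32 --pipeline=32 --hide-histogram
```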