track and expose lua memory usage
We integrate our own allocator into the Lua bindings (see mimalloc_glue), but we do not track its allocations. In some extreme cases this memory can be significant: consider 20K-40K connections against a k-threaded Dragonfly with interpreter_per_thread=300, running BullMQ read requests.
End result: expose used_memory_lua (same name as in Valkey) via /metrics and via "INFO MEMORY".
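As a rough illustration of the idea (not the actual mimalloc_glue code), a lua_Alloc shim around mimalloc could keep a running counter that we then surface as used_memory_lua; the counter and helper names below are hypothetical:

```cpp
#include <atomic>
#include <cstddef>

extern "C" {
#include <lua.h>
}

#include <mimalloc.h>

// Hypothetical counter; in Dragonfly this would live next to the
// per-thread allocator glue and be aggregated for INFO MEMORY / /metrics.
static std::atomic<size_t> used_memory_lua{0};

// lua_Alloc-compatible allocator that routes to mimalloc and tracks net usage.
static void* TrackingAlloc(void* ud, void* ptr, size_t osize, size_t nsize) {
  (void)ud;
  if (nsize == 0) {  // free request
    if (ptr) {
      used_memory_lua.fetch_sub(osize, std::memory_order_relaxed);
      mi_free(ptr);
    }
    return nullptr;
  }
  void* res = mi_realloc(ptr, nsize);  // allocation or reallocation
  if (res) {
    // When ptr == nullptr, Lua passes the object type in osize, not a size.
    size_t old = ptr ? osize : 0;
    if (nsize >= old)
      used_memory_lua.fetch_add(nsize - old, std::memory_order_relaxed);
    else
      used_memory_lua.fetch_sub(old - nsize, std::memory_order_relaxed);
  }
  return res;
}

lua_State* NewTrackedState() {
  return lua_newstate(TrackingAlloc, nullptr);
}
```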
Another thing: interpreter_per_thread may block a client, which in turn creates a hidden bottleneck. It is possible to detect the blocking event in the Borrow() function and expose it via "INFO STATS" / metrics, so that we can identify this situation easily (a rough sketch follows below).
Also count the total number of interpreters.
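A minimal sketch of both ideas, assuming a simple borrow/return pool (this is not Dragonfly's actual InterpreterManager, and all names are illustrative): count how often Borrow() has to wait, plus how many interpreters exist, and report both through INFO STATS / metrics.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

class Interpreter;  // stand-in for Dragonfly's Lua interpreter wrapper

class InterpreterPool {
 public:
  // Stats we would surface via "INFO STATS" / /metrics.
  struct Stats {
    uint64_t blocked_borrows = 0;   // times a client had to wait for an interpreter
    size_t total_interpreters = 0;  // interpreters currently created
  };

  void Add(Interpreter* ip) {
    std::lock_guard lk(mu_);
    free_.push_back(ip);
    ++stats_.total_interpreters;
  }

  Interpreter* Borrow() {
    std::unique_lock lk(mu_);
    if (free_.empty()) {
      ++stats_.blocked_borrows;  // record the "client blocked" event
      cv_.wait(lk, [this] { return !free_.empty(); });
    }
    Interpreter* ip = free_.back();
    free_.pop_back();
    return ip;
  }

  void Return(Interpreter* ip) {
    {
      std::lock_guard lk(mu_);
      free_.push_back(ip);
    }
    cv_.notify_one();
  }

  Stats GetStats() const {
    std::lock_guard lk(mu_);
    return stats_;
  }

 private:
  mutable std::mutex mu_;
  std::condition_variable cv_;
  std::vector<Interpreter*> free_;
  Stats stats_;
};
```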
make sure we can flush lua memory and we do not have any leaks from lua
Re/ flush Lua memory:
- I played a bit with calling lua_gc(LUA_GCCOLLECT) directly. While it's a simpler implementation on our side (no need for a mutex, return_untracked_, etc.), it is not able to free as much memory as closing the instance and re-initializing it (see the standalone sketch below).
- A single (new, idle) Lua instance takes ~26kb of memory.
- I could not get the Lua instances (running all sorts of simple scripts) to consume more than 70kb per instance. GC kicks in and reduces consumption; it usually fluctuates between 30kb-60kb. This of course depends on the script at hand; I'm running some simple scripts and BullMQ load tests.
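For reference, a standalone sketch of the two options using the stock Lua C API (plain Lua, not Dragonfly's interpreter wrapper; the numbers printed will vary by Lua version and loaded libraries):

```cpp
extern "C" {
#include <lua.h>
#include <lualib.h>
#include <lauxlib.h>
}
#include <cstdio>

// Lua's own view of its heap: KB (LUA_GCCOUNT) plus remainder bytes (LUA_GCCOUNTB).
static size_t LuaHeapBytes(lua_State* L) {
  return size_t(lua_gc(L, LUA_GCCOUNT, 0)) * 1024 +
         size_t(lua_gc(L, LUA_GCCOUNTB, 0));
}

int main() {
  lua_State* L = luaL_newstate();
  luaL_openlibs(L);

  // Grow the Lua heap with some throwaway work.
  luaL_dostring(L, "t = {} for i = 1, 100000 do t[i] = ('x'):rep(32) end t = nil");
  std::printf("after script:    %zu bytes\n", LuaHeapBytes(L));

  // Option A: full GC cycle in place. Simpler, but keeps interned strings,
  // loaded libraries, caches, etc.
  lua_gc(L, LUA_GCCOLLECT, 0);
  std::printf("after GCCOLLECT: %zu bytes\n", LuaHeapBytes(L));

  // Option B: tear the state down and rebuild it. Frees everything and
  // returns to the baseline of a fresh, idle instance, at the cost of
  // re-initialization and extra coordination (mutex, return_untracked_, ...).
  lua_close(L);
  L = luaL_newstate();
  luaL_openlibs(L);
  std::printf("fresh state:     %zu bytes\n", LuaHeapBytes(L));

  lua_close(L);
  return 0;
}
```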
Re/ leaks:
I was not able to detect any leaks after playing a bit with BullMQ and manually written scripts. After running many scenarios, SCRIPT FLUSH eventually clears all instances and memory, going back to ~26kb per instance.
So maybe it's not Lua. It could be that we are still missing a rather large contributor to backing heap usage, or we have a memory leak.
I'll try to reproduce a case in which there's a gap between RSS and other means of accounting memory. If I succeed, I can investigate further.