Memory leak under high load
Self Checks
- [x] I have read the Contributing Guide and Language Policy.
- [x] This is only for bug reports; if you would like to ask a question, please head to Discussions.
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report, otherwise it will be closed.
- [x] [Chinese & Non-English users] Please submit in English, otherwise the issue will be closed :)
- [x] Please do not modify this template :) and fill in all the required fields.
Dify version
v1.10.1-fix.1
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
Run three consecutive 100-thread load tests after restarting the service.
✔️ Expected Behavior
Memory is released after the tests complete.
❌ Actual Behavior
Memory usage increases during testing and is not released.
Hi @VolXD! I'm Dosu and I’m helping the dify team.
Based on the code and recent changes, the most likely cause of the memory leak under high load is unbounded event buffering in the EventManager component. Events are collected and stored in a list but are never explicitly cleared after processing, which can cause memory usage to grow and not be released, especially during high-concurrency tests. The dispatcher and workflow orchestration also coordinate event processing but do not clear the buffered events, so memory can accumulate if the workload is sustained or if events are retained indefinitely [EventManager source].
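To illustrate the pattern described above, here is a minimal sketch of an append-only event buffer with a possible clear step; the class, attribute, and method names are assumptions drawn from this description, not the actual Dify EventManager source.

```python
# Sketch of the suspected pattern: events are appended but never removed.
# Names (_events, append, clear) are illustrative assumptions only.
from threading import Lock


class EventManager:
    def __init__(self):
        self._events = []        # grows with every event received
        self._lock = Lock()

    def append(self, event):
        with self._lock:
            self._events.append(event)   # buffered, never dropped

    def clear(self):
        """Possible mitigation: release buffered events once a run completes."""
        with self._lock:
            self._events.clear()
```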
To confirm this, you can monitor the size of the event buffer during and after your tests, or use memory profiling tools (like tracemalloc or objgraph) to check for lingering event objects. Heap dumps after test completion can also help pinpoint which objects are not being released.
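As one way to do that check, a minimal tracemalloc comparison around a load test could look like the sketch below; run_load_test is a placeholder for however you drive the 100-thread test.

```python
# Sketch: compare heap allocations before and after a load-test run.
import tracemalloc


def run_load_test():
    """Placeholder: drive the 100-thread test against the Dify API here."""
    ...


tracemalloc.start(25)                       # keep up to 25 frames per allocation
before = tracemalloc.take_snapshot()

run_load_test()

after = tracemalloc.take_snapshot()
for stat in after.compare_to(before, "lineno")[:10]:
    print(stat)                             # top 10 growing allocation sites
```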
As a mitigation, consider restarting the service between heavy test runs, limiting concurrency, or patching the code to clear the event buffer after execution completes. If you can, try instrumenting the code to log the length of the _events list in EventManager before and after tests.
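A minimal instrumentation sketch for that last suggestion is shown below; it assumes the buffer is reachable as event_manager._events, which may need adjusting to the real attribute path.

```python
# Sketch: log the event-buffer size before and after a test run.
# `event_manager` and `_events` are assumptions based on the description above.
import logging

logger = logging.getLogger("event-buffer-probe")


def log_event_buffer(event_manager, label):
    size = len(getattr(event_manager, "_events", []))
    logger.info("%s: EventManager._events holds %d events", label, size)


# Usage around a test:
#   log_event_buffer(event_manager, "before run")
#   run_load_test()
#   log_event_buffer(event_manager, "after run")
```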
If you have logs, heap snapshots, or can share more details about which components are consuming the most memory, that would help narrow down the root cause further.
To reply, just mention @dosu.
same behavior in v1.8.1
That confirms the memory leak is a longstanding issue and not specific to recent versions. The root cause is likely the unbounded event buffering in the EventManager, which collects events in a list but never clears it after processing—so memory usage keeps growing under high load and isn't released after tests are done (see code).
To help confirm this, you could instrument the code to log the length of the _events list before and after your tests, or use memory profilers like tracemalloc or objgraph to see if event objects are lingering. As a temporary workaround, restarting the service between heavy test runs or limiting concurrency may help until a code fix is available.
If you can share profiling results or stats on the event buffer, that could help maintainers prioritize a fix.
To reply, just mention @dosu.
docker inspect -f '{{.State.Pid}}' <api-container-name>   # prints the container's host PID
sudo perf mem record -p <pid>   # sample memory accesses of that process
> same behavior in v1.8.1
Please upgrade to the latest version.
Regarding the issue, @VolXD, please provide more details if possible.