model_server
Memory leaks on infer requests
**Describe the bug**
The server leaks memory on infer requests.
**To Reproduce**
- Get the archive `ov-leak-debug.tar.gz`; it contains a load-simulation script, a docker-compose file to run the server, the models and the `model_config.json`, and a script that creates the models (for reference).
- Run the server with `docker compose up -d`.
- Install the dependencies: `pip install -r requirements.txt`.
- Run the script `python generate_ovms_load.py {model_name} --n-workers 10 --n-threads 10` with `{model_name}` being `static`, `dynamic` or `dynamic-nms`.
- Check OVMS memory usage to see it creep up. It fluctuates for `static`, creeps up slowly for `dynamic`, and grows rapidly for `dynamic-nms`. The memory usage doesn't go down even when the load is no longer applied.
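For reference, a load generator in the spirit of `generate_ovms_load.py` can be sketched as below. This is a hypothetical sketch, not the script from the archive: the REST port (8000), the endpoint layout (OVMS's TensorFlow-Serving-compatible REST API), and the dummy input payload are all assumptions; real input shapes depend on the models in the archive.

```python
# Hypothetical load-generator sketch (NOT the archived script).
# Assumes OVMS exposes its TF-Serving-compatible REST API on port 8000.
import concurrent.futures
import json
import urllib.request

OVMS_URL = "http://localhost:8000/v1/models/{model}:predict"  # assumed port

def send_request(model_name: str, payload: bytes) -> int:
    """Send one predict request and return the HTTP status code."""
    req = urllib.request.Request(
        OVMS_URL.format(model=model_name),
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

def run_load(model_name: str, n_workers: int, n_requests: int) -> None:
    """Fire n_requests concurrently from a pool of n_workers threads."""
    # Dummy input; the real shape depends on the model being exercised.
    payload = json.dumps({"instances": [[0.0] * 10]}).encode()
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(send_request, model_name, payload)
                   for _ in range(n_requests)]
        for f in concurrent.futures.as_completed(futures):
            f.result()  # re-raise any request error
```

Sustaining this kind of concurrent request stream while watching container memory (e.g. via `docker stats`) is enough to observe the growth described above.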
**Expected behavior**
The memory used by OVMS stays constant (or stabilizes after some time).
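"Stabilizes" can be made concrete with a small helper like the hypothetical one below (my own illustration, not from the archive): it compares the mean of the most recent window of memory samples to the preceding window and tolerates a small relative drift.

```python
# Hypothetical stabilization check over sampled memory readings (e.g. MiB).
def has_stabilized(samples, window=10, tol=0.02):
    """Return True if the mean of the last `window` samples is within
    `tol` relative tolerance of the previous window's mean."""
    if len(samples) < 2 * window:
        return False  # not enough data to compare two windows
    prev = sum(samples[-2 * window:-window]) / window
    last = sum(samples[-window:]) / window
    return abs(last - prev) <= tol * max(prev, 1e-9)
```

With a check like this, a flat or plateauing memory curve passes, while the steadily growing curves seen for `dynamic` and `dynamic-nms` fail.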
**Logs**
Not applicable; no explicit errors are present.
**Configuration**
- OpenVINO Model Server 2023.1.d789fb785, OpenVINO backend 2023.1.0.12185.9e6b00e51cd
- Checked on 12th Gen Intel(R) Core(TM) i9-12900H and on AMD Ryzen 9 5950X 16-Core Processor
- Config and models included in the archive
@darkestpigeon Thank you for your report. We have tested the models and scripts from the attachments. Here is a 4-day overview (with some breaks) of docker container memory usage + RPS:
How long was your testing workload?